ABSTRACT

The goal of efficient management of an organization's
information resource can be accomplished through the imple-
mentation and use of a data dictionary. This thesis defines
the structure and functions of a data dictionary and
analyzes the attempt of the National Bureau of Standards to
promulgate a standard software specification for use in the
evaluation and selection of data dictionaries in the federal 
government. Criteria for the "ideal" data dictionary are 
developed based on the role a dictionary can play in infor- 
mation resource management and are then used to evaluate 
four commercial data dictionary packages. Finally, some 
ideas concerning possible applications for data dictionary 
technology are presented. 
I. THE NEED FOR A DATA DICTIONARY
A. BACKGROUND 

One of the most important resources of an organization 
and one that is too often overlooked is data. People,
dollars, materials, and time are usually well controlled and 
budgeted, yet the data about an organization and its opera- 
tions is often managed haphazardly, if at all. 

Database technology has made possible the storage and 
processing of an organization's data as an integrated whole 
and allows the sharing of that processed data, or informa- 
tion, throughout the organization. A database management 
system (DBMS) acts as a librarian for the database, storing 
and retrieving data according to a particular format 
[Ref. 1]. However, a DBMS does not necessarily provide for 
the security, integrity, accountability, or maintainability 
of data. These objectives are best achieved when a data 
dictionary is used in conjunction with the DBMS. 

Simply stated, a data dictionary is a central repository 
of descriptive data about the definition, characteristics, 
location, and usage of the data found in an organization. A
fully utilized data dictionary will control the collection,
maintenance, and retrieval of this data. For example, if
the aircraft carrier U.S.S. Constellation had a data
dictionary, it would be possible to ask questions such as

What type of data is contained in a "Controlled
Equipage" record?

How many programs use the "Personnel" file?

Which departments receive the "Ammunition Transaction"
report?

What is the relationship between "Inventory Item" and
"Reorder Point"?

In which records is the field "Social Security Number"
found?

Who is authorized to update the "Readiness Status"
field?

What is the range of values for "Readiness Status" data?

In which database is the "Preventive Maintenance" file
found?

Those who will benefit from the answers to these questions
include not only the ship’s data administrator, but also 
programmers, systems development personnel, data processing 
staff, auditors, and, most important, end users at every 
level of the organization. 
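
The kinds of questions listed above can be pictured as simple lookups against a store of metadata. The Python sketch below is illustrative only: the class MiniDictionary, its method names, and the sample program and department names are assumptions made for this example and do not describe any actual data dictionary product.

```python
from collections import defaultdict


class MiniDictionary:
    """Toy metadata store; every name here is hypothetical."""

    def __init__(self):
        self.file_users = defaultdict(set)      # file name -> programs that use it
        self.report_readers = defaultdict(set)  # report name -> receiving departments
        self.field_records = defaultdict(set)   # field name -> records containing it

    def add_usage(self, program, file_name):
        self.file_users[file_name].add(program)

    def add_recipient(self, report, department):
        self.report_readers[report].add(department)

    def add_field(self, record, field_name):
        self.field_records[field_name].add(record)

    def programs_using(self, file_name):
        return sorted(self.file_users[file_name])

    def departments_receiving(self, report):
        return sorted(self.report_readers[report])

    def records_containing(self, field_name):
        return sorted(self.field_records[field_name])


dd = MiniDictionary()
dd.add_usage("PAYROLL", "Personnel")        # invented program names
dd.add_usage("MUSTER", "Personnel")
dd.add_recipient("Ammunition Transaction", "Weapons")
dd.add_field("Personnel", "Social Security Number")
dd.add_field("Medical", "Social Security Number")

print(len(dd.programs_using("Personnel")))
print(dd.records_containing("Social Security Number"))
```

Note that the store holds no personnel or ammunition data at all; it records only facts about where such data lives and who uses it.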

Even though data dictionary software has been available 
commercially since 1970 and the advantages and benefits 
associated with data dictionaries are widely recognized, 
most organizations have been slow to implement them, and the 
Department of Defense is no exception. A recent study by 
the Committee on Review of Navy Long-Range Automatic Data
Processing Planning [Ref. 2] points out that 

virtually every action by a commander, manager, or 
administrator in the Navy, as in any large organization, 
involves the acquisition and understanding of informa- 
tion: information about the organization, about its 
status, about its resources, about its environment. His 
actions usually result in the creation and promulgation 
of policies and directives: that is, information for
subordinates, peers, or superiors. 

If it is true that "the benefit derived from a dictionary is 
proportional to the size of the dictionary itself," [Ref. 3] 
the military stands to gain a great deal from the implemen- 
tation of data dictionaries. 

At present, there is no consensus in computing litera-
ture about exactly what a data dictionary should do or what
kind of data dictionary is best for a particular organiza-
tion. There are many different data dictionary packages on
the market from which to choose; most of these have similar
features. Therefore, the potential purchaser of a data
dictionary is in need of guidance when making this choice.
The United States Government has recognized this problem and
has identified standards for data dictionaries in Federal
Information Processing Standards promulgated by the National
Bureau of Standards. An understanding of these standards
and of the functions and objectives of a data dictionary
will provide the reader with a basis on which to evaluate
data dictionary packages and to use them effectively.



B. PURPOSE OF THE THESIS



We believe that it is important for managers in the 
military to understand what a data dictionary is and what it 
can do to help an organization manage its data. Thus, the 
purpose of this thesis is to provide the reader with an
understanding of the structure and functions of a data
dictionary, guidelines for the evaluation and selection of a
data dictionary, and an analysis of several commercial data
dictionary products. We will show the reader how the
management of an organization's data resource can be accom-
plished by means of a data dictionary and will recommend
ways for the role of the data dictionary to be expanded.
II. THE ANATOMY OF A DATA DICTIONARY
A. INTRODUCTION 

Because data dictionary technology is a new and continu- 
ally evolving field, it suffers from a lack of consistency 
in its terminology. The many texts and articles on the
subject and the various commercial data dictionary products
use a wide variety of differing terms. The data dictionary
itself is known as a data dictionary/directory, a data
dictionary system, or an information resource management 
dictionary. In order to provide a base of reference for the 
remainder of this thesis, we will present our own set of 
definitions distilled from our references. 

Data dictionaries run the gamut from manual, on-paper 
systems to highly sophisticated software and can be used 
both in database and non-database environments. We will
discuss automated data dictionaries only as they relate to a
database, where they have the most to offer the potential
user.

In order to assess the benefits of a data dictionary, it 
is necessary to understand how a data dictionary is orga- 
nized and what its capabilities are. A data dictionary does 
not contain the actual data that constitutes an organiza-
tion's database; instead, it is itself a database called a
metadatabase that contains metadata, or data about the data-
base data. Two types of metadata are found in a data
dictionary. Dictionary metadata tells what data exists, the
origins of the data, the attributes the data may have, how
and by whom the data may be used, what the structure of the
data is, and what the relationships between the data are.
Directory metadata tells where the data is located, how it
can be accessed, and what its physical representation within
the computer is. Together, these two types of metadata
provide the means for accessing and controlling the data in
the database. Figure 2.1 illustrates this division of
metadata.

DICTIONARY METADATA        |  DIRECTORY METADATA
---------------------------+---------------------------
- Data Present             |  - Data Location
- Data Origin              |  - Access Modes
- Attributes               |  - Physical Representation
- Security/Access          |
- Data Structure           |
- Relationships            |

Figure 2.1 Types of Data Dictionary Metadata
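
The split between dictionary metadata and directory metadata can be sketched as two small record types. This is a minimal illustration, not a vendor's design: the class and field names, and the sample values for the "Readiness Status" field, are assumptions chosen to mirror the two lists in Figure 2.1.

```python
from dataclasses import dataclass, field


@dataclass
class DictionaryMetadata:
    """What the data is: the logical description."""
    name: str
    origin: str
    attributes: dict
    authorized_users: list = field(default_factory=list)
    related_to: list = field(default_factory=list)


@dataclass
class DirectoryMetadata:
    """Where the data is: the physical description."""
    location: str
    access_mode: str
    physical_representation: str


@dataclass
class MetadataEntry:
    dictionary: DictionaryMetadata
    directory: DirectoryMetadata


entry = MetadataEntry(
    dictionary=DictionaryMetadata(
        name="Readiness Status",
        origin="Operations Department",             # hypothetical value
        attributes={"type": "code", "length": 2},   # hypothetical values
        authorized_users=["Operations Officer"],
        related_to=["Unit"],
    ),
    directory=DirectoryMetadata(
        location="READINESS file, disk volume 3",   # hypothetical value
        access_mode="indexed",
        physical_representation="2 bytes, EBCDIC",
    ),
)
print(entry.dictionary.name, "->", entry.directory.location)
```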

Data dictionaries fall into two categories--free-
standing and DBMS-dependent. Figure 2.2 shows a partial
listing of some commercial data dictionary packages
according to type. A free-standing data dictionary (also
called independent or stand-alone) is not tied to any
particular database management system (DBMS). It manages
data by utilizing software routines built into the data 
dictionary package and thus is not dependent on DBMS soft- 
ware. This independence provides flexibility: a free- 
standing data dictionary can have the capability to support 
more than one type of DBMS. However, this flexibility is 
gained at the cost of duplication of data descriptions in 
the database and the data dictionary. 
Free-standing Data Dictionaries

DATA CATALOGUE 2 (1974)
- Synergetics Corporation
DATA DESIGNER (1975)
- Database Design, Inc.
PRIDE-LOGIK (1974)
- M. Bryce & Associates, Inc.
DATAMANAGER (1975)
- Management Systems & Programming, LTD

DBMS-Dependent Data Dictionaries

ADABAS (1978)
- Software AG of North America, Inc.
DATA DICTIONARY/DATACOM (1979)
- Applied Data Research (ADR)
ORACLE (1983)
- Relational Software, Inc.
DB/DC DATA DICTIONARY (1974)
- International Business Machines
EDICT (1976)
- Infodata Systems, Inc.

Figure 2.2 Free-standing and Dependent Data Dictionaries



A DBMS-dependent data dictionary (also called merged or
integrated) is a component of a specific database management
system; it uses the software facilities available within the
DBMS to manage the data in the database. This type of data
dictionary minimizes redundancy and limits the number of
possible errors because data descriptions exist in only one 
place, in the data dictionary. It also benefits from the 
sophisticated backup and recovery facilities of the DBMS. 

A data dictionary is also described as having active or 
passive interfaces or a combination of the two. An inter-
face is a series of commands which connect the data
dictionary with other software such as compilers, operating
systems, report generators, and other programs. The data
dictionary supports these applications by providing the
metadata that is required for their execution. An active
data dictionary is one in which information is created,
accessed, or modified through the data dictionary inter-
faces. New or changed metadata is automatically updated and
stored in the data dictionary. This is not true of a
passive data dictionary: when new metadata is generated,
the data dictionary may or may not be automatically updated 
and when data is retrieved, it may be accessed through the 
data dictionary or directly from the database. 

There are many perspectives from which to look at the 
data that resides in a database. There is the physical (or 
internal) view that consists of the actual physical repre- 
sentation, format, and location of the data as "seen" by the 
computer. There is a logical (or conceptual or global 
enterprise) view called a schema which describes all of the 
data in the database in its logical format, i.e., what types 
of records are to be maintained, the contents of those 
records, and the relationships among those records. This is
the data as it would be presented to a human, not its actual 
computer format. In most cases, only the database adminis- 
trator has access to the schema. Another view is the 
external view, also called a subschema, which is a subset 
of the logical view tailored to a particular user or appli- 
cation. This is analogous to a "window" through which only 
a portion of the total data is seen. Subschemas can be 
utilized to implement security by restricting a user's
access to data. 

Figure 2.3 shows the three different perspectives of 
data in a sample database of students at the Naval 
Postgraduate School. (A) is the computer's physical view 
and thus is not visible to the human user. (B) shows the
overall logical view of this small database. (C) is a
subset of (B) as it would be seen by a user who is 
interested in only a portion of the database--in this case, 
the senior Army officer who wants information only on Army 
students. 
Physical View as "seen"
within the Computer

[physical representation not reproduced]

(A)

Logical View of Stored Data

NAME                 SSN           SERVICE   RANK
MARKEY, Ronald P.    452-43-6029   USA       O-5
JOHNSON, Bruce M.    348-57-8826   USN       O-4
BROWN, Jennifer C.   512-47-2228   USNR      O-2
DAVIS, Thomas E.     662-76-8239   USAF      O-3
MASON, Robert J.     823-48-3991   USA       O-3
GEIB, Thomas W.      773-34-8725   USN       O-4
LANE, Donna F.       371-67-7476   USNR      O-3
WILLIAMS, Guy T.     547-23-3410   USA       O-3

(B)

One External View of the Data
(subset of the logical view)

NAME                 SSN           RANK
MARKEY, Ronald P.    452-43-6029   O-5
MASON, Robert J.     823-48-3991   O-3
WILLIAMS, Guy T.     547-23-3410   O-3

(C)

Figure 2.3 Views Within a DBMS
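
The relationship between the logical view (B) and the external view (C) can be sketched as a filter plus a projection. The Python fragment below is a minimal illustration under stated assumptions: the function external_view and its parameters are invented for the example, the rows are a few of the sample students as best the figure can be read, and the SSN column is left out for brevity.

```python
# Logical view (schema): every field of every student record.
STUDENTS = [
    {"name": "MARKEY, Ronald P.", "service": "USA", "rank": "O-5"},
    {"name": "JOHNSON, Bruce M.", "service": "USN", "rank": "O-4"},
    {"name": "BROWN, Jennifer C.", "service": "USNR", "rank": "O-2"},
    {"name": "DAVIS, Thomas E.", "service": "USAF", "rank": "O-3"},
    {"name": "MASON, Robert J.", "service": "USA", "rank": "O-3"},
    {"name": "WILLIAMS, Guy T.", "service": "USA", "rank": "O-3"},
]


def external_view(rows, visible_fields, predicate):
    """Return only the rows and fields a given user is allowed to see."""
    return [
        {f: row[f] for f in visible_fields}
        for row in rows
        if predicate(row)
    ]


# Subschema for the senior Army officer: Army students, name and rank only.
army_view = external_view(
    STUDENTS,
    visible_fields=("name", "rank"),
    predicate=lambda row: row["service"] == "USA",
)
for row in army_view:
    print(row["name"], row["rank"])
```

Because the user of this view never receives the other rows or the service field, the same mechanism serves as a simple access restriction.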



B. THE STRUCTURE OF A DATA DICTIONARY

There are three kinds of elements upon which the struc-
ture, or schema, of a data dictionary is built: entities,
attributes, and relationships. The basic element of the
dictionary is the entity. Each entity has a unique name and
represents an object in the real world, such as a person,
thing, or idea about which information is recorded. For
example, in our Naval Postgraduate School database we
collected information about students. We also described the
students by name, Social Security number, service, and rank.
These characteristics of an entity are called attributes,
and can be either quantitative or qualitative. 

A relationship is a logical link between two entities
that can also be described by attributes. A relationship
will fall into one of three categories of mappings: one-to-
one, one-to-many/many-to-one, or many-to-many. A one-to-
one relationship exists when each entity or attribute is
logically linked to one and only one other entity or attri-
bute. For instance, we say that there is a one-to-one rela-
tionship between an individual's social security number and
his name. In a one-to-many/many-to-one relationship, each
entity or attribute is logically linked to one or more other
entities or attributes. An example of this is the relation-
ship between the instructor of a class and the students in
that class. A many-to-many relationship occurs when one or
more entities or attributes is related to one or more other
entities or attributes. For example, there is a many-to-
many relationship between the attributes "color" and "model"
of a type of car--each color may be available on many 
different car models and each car model may be available in 
many different colors. 
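
The three categories of mappings can be told apart mechanically from a list of linked pairs. The following sketch is one possible way to do so; the function name classify and the sample pairs are assumptions made for illustration.

```python
def classify(pairs):
    """Classify a relationship given as (left, right) pairs."""
    left_to_right, right_to_left = {}, {}
    for left, right in pairs:
        left_to_right.setdefault(left, set()).add(right)
        right_to_left.setdefault(right, set()).add(left)
    # A left item linked to several right items, and vice versa.
    left_fans_out = any(len(v) > 1 for v in left_to_right.values())
    right_fans_out = any(len(v) > 1 for v in right_to_left.values())
    if left_fans_out and right_fans_out:
        return "many-to-many"
    if left_fans_out:
        return "one-to-many"
    if right_fans_out:
        return "many-to-one"
    return "one-to-one"


ssn_to_name = [("547-23-3410", "Guy T. Williams"),
               ("512-47-2228", "Jennifer C. Brown")]
instructor_to_student = [("Instructor A", "Student 1"),
                         ("Instructor A", "Student 2")]
color_to_model = [("red", "sedan"), ("red", "coupe"), ("blue", "sedan")]

print(classify(ssn_to_name))
print(classify(instructor_to_student))
print(classify(color_to_model))
```

The classification reflects only the pairs supplied; a relationship that happens to look one-to-one in a small sample may still be defined as one-to-many in the schema.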

In order to understand the generic terms we have 
presented in their proper context, it is important to 
differentiate between the dictionary schema itself, the 
metadatabase that it governs, and the "real" data in the 
organization’s database. These concepts are made even more 
confusing because the terminology used to refer to these 
three levels of data differs from vendor to vendor and from
author to author. We will look at these levels using the
Applied Data Research, Inc. DATADICTIONARY terminology
[Ref. 4] because it provides the clearest distinction
between the three. (DATADICTIONARY will be discussed in
depth in Chapter V.)

At the highest level of abstraction, entities, attri-
butes, and relationships are grouped by type:

the dictionary schema can then be thought of as
containing all existing entity-types, relationship-
types, and attribute-types, any one of which will also
be referred to as a schema descriptor [Ref. 5].

The schema descriptors are the general categories of data 
that is stored in the metadatabase. Figure 2.4 shows exam- 
ples of some standard schema descriptors. 



Entity-types      Attribute-types       Relationship-types
------------      ---------------       ------------------
File              Author                Contains
Record            Description           Owns
Field             Password              Processes
Module            Status                Derived From
Program           Version               Resides
Report            Frequency             Uses
Job               Security Class        Includes
Dataview          Alias                 Authority
User              Comment               Accesses
System            Effective Date
Process           Usage Statistics

Figure 2.4 Sample Schema Descriptors



At the metadatabase level, we look at specific instances
of schema descriptors. Thus, we define an entity-occurrence
as a specific instance of the general category entity-type.
If PROGRAM is the entity-type, ACCOUNTS RECEIVABLE could be
one entity-occurrence. Similarly, a relationship-occurrence
is a specific instance of the general category relationship-
type. The relationship-type ACCESSES may have as a
relationship-occurrence PROGRAM-ACCESSES-FILE. At this
level, we also talk about the specific characteristics of an
attribute-type. An attribute-type is the name of a charac-
teristic of an entity-occurrence, as Social Security Number
characterizes a student. An attribute-characteristic is not
the value of the attribute-type, but the parameters of an
attribute-type, such as its length and format. For example,
the attribute-type Social Security Number will be character-
ized as eleven characters long, of the form 999-99-9999.
Entity-occurrences, relationship-occurrences, and attribute-
characteristics will be referred to as the descriptors of
the metadatabase.

At the "real" data level of the organization's database,
we think in terms of actual values of data, such as
"Jennifer C. Brown", "547-23-3410", "left-handed monkey
wrench", "IBM 3033", or "93943". These are all values of
the attributes of an entity, and are called attribute-
values.

An example of each of the levels of data is given in 
Figure 2.5. We will use the generic terms entity, attri- 
bute, and relationship in this thesis where it is not neces- 
sary to distinguish between the three levels. 

When a data dictionary is received from the vendor, it
contains a system standard schema which includes certain
basic entity-types, attribute-types, and relationship-types
chosen by the vendor. A data dictionary is extensible if an
organization is able to customize the schema by defining its
own entity-types, attribute-types, and relationship-types in
addition to those included in the system standard schema.



Schema Descriptors              Example
  Entity-type                   Record
  Attribute-type                Name
  Relationship-type             Contains

Metadatabase Descriptors        Example
  Entity-occurrence             Student
  Attribute-characteristic      25 characters, alphanumeric
  Relationship-occurrence       Student-Contains-Name

Database Data                   Example
  Attribute-value               Ronald P. Markey

Figure 2.5 Comparison of Data Levels
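
The three levels of data can be laid side by side in a few lines of Python. This is a sketch under stated assumptions: the variable and function names are invented for illustration, and the conformance check looks only at length, not at format.

```python
# Level 1: dictionary schema (schema descriptors).
ENTITY_TYPES = {"Record", "Field"}
ATTRIBUTE_TYPES = {"Name"}
RELATIONSHIP_TYPES = {"Contains"}

# Level 2: metadatabase (descriptors), each tied to a schema descriptor.
entity_occurrences = {"Student": "Record"}        # occurrence -> entity-type
attribute_characteristics = {                       # attribute-type -> parameters
    "Name": {"length": 25, "format": "alphanumeric"},
}
relationship_occurrences = [("Student", "Contains", "Name")]

# Level 3: the organization's database (attribute-values).
database_row = {"Name": "Jennifer C. Brown"}


def metadatabase_is_valid():
    """Every descriptor must refer to a type defined in the schema."""
    return (
        all(t in ENTITY_TYPES for t in entity_occurrences.values())
        and all(a in ATTRIBUTE_TYPES for a in attribute_characteristics)
        and all(r in RELATIONSHIP_TYPES for _, r, _ in relationship_occurrences)
    )


def conforms(value, characteristic):
    """Check an attribute-value against the length in its characteristic."""
    return len(value) <= characteristic["length"]


print(metadatabase_is_valid())
print(conforms(database_row["Name"], attribute_characteristics["Name"]))
```

Only the third level holds values such as a student's name; the first two levels describe what may be stored and how.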
C. THE FUNCTIONS OF A DATA DICTIONARY



The functions performed by a typical data dictionary 
fall into four categories: definition, update, retrieval, 
and software interface. A data dictionary should be evalu- 
ated in each category according to the ease and success with 
which the functions are performed. 
system standard schema, or the dictionary administrator may
use customized data types as necessary, assuming the 
dictionary is extensible. 

2. Update

As an organization evolves, so does its data. One 
of the functions of the data dictionary is to allow the 
addition, modification, and deletion of elements. For 
instance, a new Navy regulation might require the supply
department to keep track of certain data about a new inven-
tory item and to report this data quarterly. Or perhaps the
administrative department will have to change zip codes to 
the new nine-digit format on all correspondence. Each of 
these changes will be introduced via modifications to the 
dictionary schema. 

3. Retrieval

Information can be retrieved from a data dictionary
by using query language commands or the report-generating
capability of the dictionary. A dictionary will provide
structured commands or an English-like query language that
will help the supply department to find out the Navy part
number for a monkey wrench. It will also allow the
dictionary administrator to find out which users have access
to a particular subschema. Reports are produced by a data
dictionary according to a vendor-defined format or to user 
specifications. Reports generally produce a larger volume 
response than queries and are often printed out in hard 
copy. 

4. Software Interface 

The software interface function provides a means of 
access to the data dictionary for applications software, 
including compilers, editors, and database management 
systems. A COPY command is used to bring data descriptions
(e.g., of records or files) directly into the program being
developed from the data dictionary. Thus, the job of the
programmer is made easier and data use is standardized. It 
is also possible for applications software to directly 
retrieve and make changes to the elements in a data 
dictionary. 
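
The idea of a program drawing its record description from the dictionary, rather than declaring it by hand, can be sketched as follows. This is an analogy only, not a rendering of any vendor's COPY facility: the table RECORD_DESCRIPTIONS and the functions copy_record and parse are hypothetical, the 25-character name and 11-character Social Security number follow the examples in this chapter, and the service and rank lengths are assumptions.

```python
# Hypothetical dictionary entries: record name -> list of (field, length, kind).
RECORD_DESCRIPTIONS = {
    "STUDENT": [
        ("NAME", 25, "alphanumeric"),
        ("SSN", 11, "alphanumeric"),
        ("SERVICE", 4, "alphanumeric"),   # assumed length
        ("RANK", 3, "alphanumeric"),      # assumed length
    ],
}


def copy_record(record_name):
    """Return the field layout of a record as (field, start, end) offsets."""
    layout, offset = [], 0
    for field_name, length, _kind in RECORD_DESCRIPTIONS[record_name]:
        layout.append((field_name, offset, offset + length))
        offset += length
    return layout


def parse(record_name, line):
    """Split a fixed-width line using the layout held in the dictionary."""
    return {
        name: line[start:end].strip()
        for name, start, end in copy_record(record_name)
    }


line = "BROWN, Jennifer C.".ljust(25) + "512-47-2228" + "USNR" + "O-2"
print(parse("STUDENT", line))
```

Every program that obtains its layout this way sees the same field names and lengths, which is the standardization benefit described above.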
III. FEDERAL INFORMATION PROCESSING STANDARD FOR DATA
DICTIONARY SYSTEMS



A. INTRODUCTION 

The Institute for Computer Sciences and Technology of 
the National Bureau of Standards is in the process of devel-
oping a standard software specification for data diction-
aries. The Federal Information Processing Standard for Data
Dictionary Systems (FIPS DDS) is intended to serve as a
guideline for the evaluation and selection of data diction-
aries to be used by the federal government. The four
volumes¹ "specify and describe the functionality, database
structure, and user interfaces of the FIPS DDS" [Ref. 6].

We examined three volumes of the FIPS DDS: Command
Language Interface Specifications (volume 2), Interactive
Interface Descriptions (volume 3), and Dictionary
Administrator Support Specifications (volume 4). The
subject of each of the volumes corresponds to one of the 
three categories of users who will interact with a data 
dictionary — the experienced user, the relatively inexperi- 
enced user, and the administrator of the data dictionary. 

The FIPS DDS describes in detail a suggested system 
standard schema for a data dictionary, including definitions 
and use of the schema descriptors. Each of the volumes 
presents the syntax for commands necessary for its target 
users to manipulate the dictionary. In addition, the 
results of each command are detailed, with error messages
and "successful completion" messages listed where 
applicable. 



¹ Note: Volume 1 is not yet available for review. The
FIPS DDS is in draft form and has not been formally approved 
by the National Bureau of Standards. 
B. SYSTEM STANDARD SCHEMA

The system standard schema set forth in t
provides basic entity-types,
relationship-types as follows:

Entity-types

1. SYSTEM--a collection of pro

2. PROGRAM--an automated proce

3. MODULE--an automated proc
subdivision of a PROGRAM o
called by a PROGRAM

4. FILE--an organization's dat

5. RECORD--logically associate
the organization

6. DOCUMENT--human-readable da

7. ELEMENT--data belonging to

8. USER--members or collection
the organization using the
the data dictionary

9. DICTIONARY-USER--users of
itself

10. ACCESS-CONTROLLER--specifie
an entity or set of entitie

SYSTEM, PROGRAM, and MODULE ar
FILE, RECORD, DOCUMENT, and ELEMENT
USER is classed as "External",
ACCESS-CONTROLLER are of the class

Attribute-types

There are 55 attribute-types
standard schema, similar to the one

Relationship-types

The standard relationship-types
follows:
1. CONTAINS--describes entities composed conceptually
of other entities

2. PROCESSES--shows the relationship between a process
and data

3. RESPONSIBLE-FOR--shows the association between enti-
ties representing organizational components and
entities denoting organizational responsibility

4. RUNS--shows the relationship between a user and a
process

5. TO--shows the flow between two processes

6. DERIVED-FROM--shows that an entity is the result of
some operation on another entity

The FIPS DDS includes an extensibility facility to
provide for the customization of the system standard
schema to match the organization's needs.
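
A system standard schema with an extensibility facility can be pictured as a fixed set of type names plus a controlled way of adding to it. The sketch below is illustrative and is not taken from the FIPS DDS itself: the class Schema and its methods are hypothetical, the type names are those listed in this section as they appear there, and SHIP is an invented organization-specific type.

```python
STANDARD_ENTITY_TYPES = frozenset({
    "SYSTEM", "PROGRAM", "MODULE", "FILE", "RECORD",
    "DOCUMENT", "ELEMENT", "USER", "DICTIONARY-USER", "ACCESS-CONTROLLER",
})
STANDARD_RELATIONSHIP_TYPES = frozenset({
    "CONTAINS", "PROCESSES", "RESPONSIBLE-FOR", "RUNS", "TO", "DERIVED-FROM",
})


class Schema:
    """Standard schema plus any types the organization adds."""

    def __init__(self):
        self.entity_types = set(STANDARD_ENTITY_TYPES)
        self.relationship_types = set(STANDARD_RELATIONSHIP_TYPES)

    def add_entity_type(self, name):
        if name in STANDARD_ENTITY_TYPES:
            raise ValueError(f"{name} is already part of the standard schema")
        self.entity_types.add(name)

    def is_extension(self, name):
        return name in self.entity_types and name not in STANDARD_ENTITY_TYPES


schema = Schema()
schema.add_entity_type("SHIP")   # hypothetical organization-specific type
print(sorted(schema.entity_types - STANDARD_ENTITY_TYPES))
```

Keeping the standard types immutable and the extensions separate makes it easy to report which parts of a schema are local additions.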

C. COMMAND LANGUAGE INTERFACE SPECIFICATIONS

The experienced user is one who is familiar with the 
structure and commands of the data dictionary and who needs 
access to the full functionality of the data dictionary. 
Command language commands are used to facilitate this access 
by allowing the user to: 

--define data elements

--maintain the dictionary (add/modify/delete)

--report on dictionary elements

--query the dictionary about data elements

--build entity lists and perform operations on groupings
of entities that meet certain criteria (useful for global,
vice individual, operations)

--support applications programs that interact with the
data dictionary

--perform general utilities, such as changing the mode
of operation and obtaining help information.
The syntax of each of the command language commands is
presented in the FIPS DDS using Backus-Naur form.² For
example, the following command would be used to modify an
entity that already exists in the dictionary:

MODIFY-ENTITY

{[WHERE] NAME [IS] <entity-name>

[ADD NEW-VERSION [<version-number>]]

WHERE ATTRIBUTES [ARE] <attribute-clause-1>

[,...., ] <attribute-clause-n> ] ]}

where:

--entity-name refers to a single entity in the
dictionary

--NEW-VERSION is an optional clause which results in the
creation of a new entity which has a primary-name consisting
of the assigned-name of the entity-name specified and the
next-highest version-number

--attribute-clause-n refers to a clause used to desig-
nate the attributes of the specified entity which are to be
modified
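
The effect of the NEW-VERSION clause can be sketched with a small in-memory store. This is an illustration under stated assumptions, not the FIPS DDS implementation: the class EntityStore and its methods are hypothetical, the explicit version-number option is ignored, and the choice to start the new version as a copy of the current one is an assumption the specification excerpt above does not settle.

```python
class EntityStore:
    """Toy store keyed by (assigned-name, version-number)."""

    def __init__(self):
        self.entities = {}

    def add(self, assigned_name, attributes, version=1):
        self.entities[(assigned_name, version)] = dict(attributes)

    def latest_version(self, assigned_name):
        versions = [v for (n, v) in self.entities if n == assigned_name]
        if not versions:
            raise KeyError(assigned_name)
        return max(versions)

    def modify(self, assigned_name, changes, new_version=False):
        """Apply attribute changes; optionally create the next version first."""
        current = self.latest_version(assigned_name)
        if new_version:
            target = current + 1
            # Assumption: the new entity starts as a copy of the current one.
            self.entities[(assigned_name, target)] = dict(
                self.entities[(assigned_name, current)]
            )
        else:
            target = current
        self.entities[(assigned_name, target)].update(changes)
        return target


store = EntityStore()
store.add("PAYROLL", {"AUTHOR": "Smith"})          # invented sample entity
created = store.modify("PAYROLL", {"STATUS": "TEST"}, new_version=True)
print(created, store.entities[("PAYROLL", created)])
```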

D. INTERACTIVE INTERFACE SPECIFICATIONS

The interactive interface for the relatively inexperi- 
enced user is designed to lead the user step-by-step through 
the desired operations. Without having to master the 
command language commands, the interactive interface user 
has a large subset of the total functionality available 
within the data dictionary, including manipulation, 
reporting, querying, and entity list operations. The FIPS
DDS recommends that this interface be implemented by means
of "panels" (screens) that are presented to the user in 
sequence and which contain the following information areas: 



² Backus-Naur form is explained in Appendix A.
1. state area--tells the user where (in which
dictionary) he is and what he is doing

2. data area--for entering and displaying data

3. schema area--used mostly for dictionary updates to
show available options and limitations on actions

4. message area--for error messages and warnings

5. action area--tells the user how to proceed from the
current panel

6. help area--for the display of help information
requested by the user

The user begins his session with the data dictionary at
a "home panel" which provides entry into the system. At any
point along the way he has the option of saving or undoing 
any panel with which he has been working. This panel-driven 
interface ensures that the user always knows where he is in 
the dictionary, what mistakes he has made, what choices he 
has to continue, and what help is available to him. 
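
A panel with its six information areas, together with the ability to back out of a panel, can be sketched as follows. The names Panel, Session, render, go_to, and undo are assumptions made for this illustration; the FIPS DDS describes the areas but this code does not reproduce its panel layouts.

```python
from dataclasses import dataclass, field


@dataclass
class Panel:
    state: str                                   # which dictionary, what the user is doing
    data: dict = field(default_factory=dict)     # entered and displayed data
    schema: list = field(default_factory=list)   # options and limits on actions
    message: str = ""                            # errors and warnings
    action: list = field(default_factory=list)   # how to proceed from here
    help_text: str = ""                          # help requested by the user

    def render(self):
        lines = [
            f"STATE:   {self.state}",
            f"DATA:    {self.data}",
            f"SCHEMA:  {', '.join(self.schema)}",
            f"MESSAGE: {self.message}",
            f"ACTION:  {', '.join(self.action)}",
            f"HELP:    {self.help_text}",
        ]
        return "\n".join(lines)


class Session:
    """Tracks the sequence of panels, starting from the home panel."""

    def __init__(self, home):
        self.history = [home]

    @property
    def current(self):
        return self.history[-1]

    def go_to(self, panel):
        self.history.append(panel)

    def undo(self):
        """Discard the current panel and return to the previous one."""
        if len(self.history) > 1:
            self.history.pop()
        return self.current


home = Panel(state="HOME", action=["UPDATE", "QUERY", "REPORT"])
session = Session(home)
session.go_to(Panel(state="UPDATE ENTITY", message="name required"))
print(session.current.render())
```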

E. DICTIONARY ADMINISTRATOR SUPPORT SPECIFICATIONS

The administrator of the data dictionary, of course, has 
access to both the standard command language and the inter- 
active interface. His or her main concern, however, is the 
management of the schema. This is accomplished by means of 
a specialized set of commands for 

--extending the system standard schema 

--reporting on the schema 

--implementing access control measures 

--controlling export from and import to the dictionary.

We have already defined the extensibility facility as 
the ability to add schema descriptors to the system standard 
schema. The report facility allows the administrator to 
generate a listing of the entire schema or any subset 
thereof. The security facility provides commands for 
restricting the access of users to the dictionary by speci-
fying which commands the user is allowed to execute. The
export/import facility allows transfer of parts of one
dictionary to another, but only between dictionaries whose
schemas are identical in order to preserve the integrity of
the "target" dictionary. 
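
The restriction on export/import can be sketched as a guard that compares the two schemas before anything is copied. The class Dictionary and the function import_entries are hypothetical names used for illustration, and a schema is reduced here to a set of entity-type names, which is a simplification.

```python
class Dictionary:
    """Toy dictionary: a schema (entity-type names) plus named entries."""

    def __init__(self, entity_types, entries=None):
        self.entity_types = frozenset(entity_types)
        self.entries = dict(entries or {})   # entry name -> entity-type


def import_entries(source, target, names):
    """Copy the named entries from source into target.

    Refuses to run unless both dictionaries have identical schemas.
    """
    if source.entity_types != target.entity_types:
        raise ValueError("schemas differ; import refused")
    for name in names:
        target.entries[name] = source.entries[name]
    return len(names)


ship_a = Dictionary({"FILE", "RECORD"}, {"PERSONNEL": "FILE"})
ship_b = Dictionary({"FILE", "RECORD"})
print(import_entries(ship_a, ship_b, ["PERSONNEL"]), ship_b.entries)
```

Checking before copying means a refused import leaves the target dictionary untouched.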

F. EVALUATION 

It is certainly true that the FIPS DDS presents the
reader with very detailed specifications of the commands and
facilities for a standardized data dictionary; the volumes
we reviewed could serve as the basis for an initial design
specification for the development of data dictionary soft-
ware. A dictionary based on the FIPS specifications would
perform the required functions discussed in Chapter II and
would contribute to the organization's management of its
data. The military and the federal government would benefit
greatly from the availability of standard software to
achieve control over their data resource.

The major contribution of the FIPS DDS is its orienta- 
tion to the needs of the different kinds of users of a data 
dictionary. This is particularly evident in the interface 
that is suggested for use by inexperienced users of the data 
dictionary. The panel-driven format with its six informa-
tion areas is far less intimidating than the syntax required
by the command language. Even so, the interactive interface
still requires a certain degree of sophistication on the
part of the "inexperienced" user if he is to be able to
manipulate the dictionary. Another strong point of the FIPS
DDS is its consistency of presentation and format. No
matter what the operation, the procedures needed to manipu-
late the dictionary and the manner in which the dictionary
"responds" to the user are logical and predictable. The
commands, however, are complex and require knowledge of
Backus-Naur form.

Even though the FIPS DDS does indeed provide a compre-
hensive software standard for the computer professional, we
do not believe that it achieves its goal of providing a
guide for the evaluation and selection of data dictionaries.
Although the addition of the introductory volume may help
remedy the problem, the three volumes of specifications
ignore the forest of reasons behind the implementation of a 
data dictionary while concentrating solely on the patterns 
of the leaves on each tree. The FIPS DDS will not be 
extremely useful to the individual searching for basic 
assistance in evaluating commercial data dictionary pack- 
ages. Many of the books and articles we have reviewed 
provide better explanations of data dictionary features and 
comprehensive evaluation criteria. 

We found that the terminology that the FIPS DDS uses for 
the dictionary schema and the metadatabase is not explained 
clearly nor is it any less confusing than that of any other 
publication. In addition, no specific examples of how an 
organization’s data would be entered in the data dictionary 
are given. We feel that it is more important for the poten- 
tial data dictionary user to understand how a data 
dictionary will assist in the management of data than to see 
samples of every conceivable type of error message that 
could occur. A summary of recommended features such as the
one we have just presented and a list of criteria for
evaluation would be far easier for the reader to digest.

None of the data dictionary packages we have reviewed 
does things totally the "FIPS way", and it is unlikely that 
any commercial dictionary vendor will ever conform exactly 
to FIPS DDS guidelines. However, it is likely that the 
federal government will insist that FIPS standards be 
incorporated into future dictionaries intended for
government use. In the next chapter we will develop a set of
criteria for an "ideal" data dictionary, taking FIPS DDS
recommendations into account. In Chapter V we will examine
four commercial data dictionary packages and evaluate their
success in meeting the ideal criteria.

IV. THE ROLE OF THE DATA DICTIONARY IN INFORMATION RESOURCE
MANAGEMENT



In this chapter we will see how a data dictionary can
contribute to the goal of efficient management of an
organization's data. We will first discuss the process of
developing an information system in an organization and then
discuss the three objectives of data dictionaries that we
have identified as contributing the most to the
accomplishment of this goal: data security, data integrity,
and documentation/maintenance. We will then develop a set of
criteria for the "ideal" data dictionary to be used in the
evaluation of data dictionary packages.



A. INFORMATION RESOURCE MANAGEMENT

Organizations today have become increasingly aware of 
the need to manage data just as they manage other essential 
resources. If properly managed, the necessary data will be 
available, up-to-date, and retrievable when required to 
provide information that is of value to the organization. 
This concept is known as Information Resource Management, or
IRM, although it might also be referred to as Data Resource
Management.

IRM has been the focus of a great deal of interest in 
recent years. In October of 1980, the Institute for 
Computer Sciences and Technology of the National Bureau of 
Standards (NBS) and the Association for Computing Machinery
(ACM) co-sponsored a workshop on IRM strategies and tools.
It was based on the premise that

IRM is currently one of the most significant topics
being discussed concerning information systems, and is
being discussed along a variety of lines of thought.
These include business systems planning; information
systems analysis, design, and development; database
design and implementation; the disciplines of office
management, paperwork management, and information
sciences management; and the various problems and costs
associated with implementing IRM to include each of
these areas. [Ref. 7]



The Proceedings of the workshop defined IRM as 



whatever policy, action, or procedure concerning
information (both automated and non-automated supported)
which management establishes to serve the overall
current and future needs of the enterprise. Such
policies, etc., would include considerations of
availability, timeliness, accuracy, integrity, privacy,
security, auditability, ownership, use, and cost
effectiveness. [Ref. 8]



The recommendations of the NBS/ACM workshop on the role that
the data dictionary should play in IRM were incorporated
into the Federal Information Processing Standard for Data
Dictionary Systems that we discussed in Chapter III.

In order to understand how the data dictionary contrib- 
utes to the production of valuable information for an orga- 
nization, we will look more closely at the organization 
itself and at its functions. An organization is made up of 
many systems that convert resources into usable output. An 
information system, then, is one that takes raw data and 
transforms it into information that can be used by the
organization. If the process by which the organization
develops its information systems is the heart of information
resource management, then it is the data dictionary that
keeps it ticking.

Assume that the U.S.S. Constellation has identified a
problem with the way a particular information system is
currently operating. It could be preventive maintenance
record-keeping, the supply department inventory, the
personnel administration system, or a system that affects
the entire organization. The process of analyzing the
system and developing a system to solve this problem evolves
through four distinct phases, called the System Development
Life Cycle (SDLC). We will show how the data dictionary
supports the SDLC, and thus IRM, through planning, study,
design/coding, and operation and maintenance. We have based
our analysis of the SDLC on that of Leong-Hong and Plagman
[Ref. 9].

1. Planning Phase

The Proceedings of the NBS/ACM workshop emphasized
the need for a "top-down" approach to IRM in an
organization. During the planning phase, the organization's
long-range plans, functions, and structure are analyzed to
ensure that any information system that is developed will
complement those needs.

If a data dictionary is already in existence, it can
provide information about the functions of the organization
that have been defined, or it can document the initial
definition of those functions. For each function, it must be
determined who does it, what is produced, what other
functions it interacts with, and what inputs are needed to
accomplish the function. As an example, we can say of the
Payroll function that it is performed by the disbursing
office, paychecks and leave and earnings statements are
produced, it interacts with the personnel administration
system, and it requires data about all members of the crew,
including rank/rate, time in service, and so on.
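
The four planning questions (who does it, what is produced,
what it interacts with, and what inputs it needs) can be
pictured as the fields of a dictionary entry. The following
is only a minimal sketch in Python; the class name
FunctionEntry and its field names are our own illustrative
assumptions and do not come from the FIPS DDS or from any
commercial package.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionEntry:
    """Hypothetical dictionary entry describing one organizational function."""
    name: str
    performed_by: str                                   # who does it
    outputs: list = field(default_factory=list)         # what is produced
    interacts_with: list = field(default_factory=list)  # other functions/systems
    inputs: list = field(default_factory=list)          # data needed

    def unanswered(self):
        """Return the planning questions that still lack an answer."""
        checks = {
            "performed_by": self.performed_by,
            "outputs": self.outputs,
            "interacts_with": self.interacts_with,
            "inputs": self.inputs,
        }
        return [question for question, answer in checks.items() if not answer]


# The Payroll example from the text, recorded as one entry.
payroll = FunctionEntry(
    name="Payroll",
    performed_by="disbursing office",
    outputs=["paychecks", "leave and earnings statements"],
    interacts_with=["personnel administration system"],
    inputs=["rank/rate", "time in service"],
)
```

An entry whose unanswered() list is not empty would indicate
a function that has been named but not yet fully analyzed.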

At this stage of the development process, the "big 
picture" is drawn while the details are left until later. 
Thus, general categories of data such as "accounting data" 
and "personnel data" and the transactions that affect them 
are defined and entered in the dictionary. 

In the aggregate, this planning information
constitutes a conceptual data model. "Definition and analysis of
subsequent information requirements 
base design) will be dependent 
[Ref. 10]. The fact that the deve 
been automated, rather than manual, 
dardized process. 

2. Study Phase

At this point in the SDLC, 
is introduced. The data dictionary 
dardized source of information abou 
of the organization's functions, 
butes, and relationships are chosen 
ries of data identified in the pla 
PART in the Constellation's in 
described by the attributes Navy P 
Storage Location, and Quantity, 
to-many relationship assigned betw 
Reports required to be produced a 
necessary input data is identified. 

This information provides w 
conceptual model, an expansion of 
the planning phase. The data di 
identify redundancy within the da 
whether the data entered already ex 
the aid of the dictionary, the syst 
able to determine what data is available, 

3. Design/Coding Phase

The purpose of the design phase is to provide
specifications for programming and implementing the system.
It is here that the data dictionary's schema descriptors
will be used or expanded to meet the needs of the system. If
a database does not already exist, and it is determined that
one is required, the data dictionary schema will provide a
basis from which to implement one. Data integrity is
enforced because the dictionary serves as the sole source of
data definition and structure.

When software is being coded, the data dictionary 
provides documentation for the programmer and a COPY 
facility for transporting record definitions, for example, 
into the program being developed. An important element of
the dictionary is the set of constraints that are defined
for data values. In this way, data that is input to a program
can be checked against the constraints that have been
established.
Documentation of the program includes the author, a descrip- 
tion, input requirements, output produced, and information 
on what other programs are called upon, all of which are 
incorporated into the data dictionary. 
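
Checking input against dictionary-defined constraints might
look something like the sketch below. It is written in
Python under our own assumptions: the FieldConstraint record,
the check_value function, and the nine-digit rule for SSN are
all hypothetical and are not taken from any dictionary
package.

```python
from dataclasses import dataclass


@dataclass
class FieldConstraint:
    """Hypothetical constraint record for one data element."""
    name: str
    mandatory: bool = True
    numeric: bool = False      # False means alphanumeric is acceptable
    min_length: int = 0
    max_length: int = 255


def check_value(constraint, value):
    """Return a list of violations for one input value (empty if valid)."""
    if value is None or value == "":
        return ["missing mandatory value"] if constraint.mandatory else []
    problems = []
    if constraint.numeric and not value.isdigit():
        problems.append("value must be numeric")
    if len(value) < constraint.min_length:
        problems.append("value is too short")
    if len(value) > constraint.max_length:
        problems.append("value is too long")
    return problems


# Assumed rule for illustration only: SSN is mandatory and nine digits.
ssn = FieldConstraint("SSN", mandatory=True, numeric=True,
                      min_length=9, max_length=9)
```

Because the constraint lives in one place, every program
that accepts an SSN would apply the same rule.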

4. Operation and Maintenance

After a new system has been implemented, the work of 
the data dictionary does not end. All of the documentation 
that has been recorded during the development of the system 
serves as a base of reference for the users of the system. 
In addition to the database administrator and the
administrator of the dictionary, the key players in
information resource management who benefit from the use of
a data dictionary fall into six groups, according to Allen,
Loomis, and Mannino [Ref. 12]:

1. The data administrator, who is responsible for the
overall administration of the data resource, uses
the dictionary as a tool to enforce the way data is
stored, maintained, and monitored.

2. Data processing managers benefit from the
dictionary's reports on data usage.

3. Operations personnel retrieve information from the 
dictionary about jobs that are being run. 

4. Programmers and analysts use the dictionary to 
retrieve data definitions and to document a system 
being developed. 

5. End users access the data dictionary for descrip- 
tions of their dataviews. 

6. Finally, auditors will use the documentation
provided by the data dictionary to trace data and
programs as they are used in the computer system.

It is the process of implementing a data dictionary that
we have just described (the analysis of the organization,
the definition of its functions, and the documentation of
its information systems) that makes the dictionary so
important in information resource management. We have seen that
during the development of an information system, the data 
dictionary is involved from the initial planning stage, 
through the programming process, through the operation, and 
into the maintenance of the system. The dictionary provides 
the standards for data which will be used throughout the 
life of the system and referenced when developing other 
systems. Key contributions include decreasing the amount of 
redundancy of data required to be stored, enforcing security 
of the valuable data resource through access controls and 
implementation of user views, and providing documentation 
which serves as a "corporate history" and as a reference 
upon which maintenance and auditing are based. These
objectives of data dictionary usage are discussed in detail
in the next section.

B. OBJECTIVES OF A DATA DICTIONARY



In this section we will focus on the three major
contributions of the data dictionary to the management of an
organization's data. These are data security, data
integrity, and documentation/accountability. Although we
recognize that other objectives of data dictionary usage
might be identified, we believe that each will fall into one
of these three major groupings.

1. Data Security

There are two distinct levels of security for the
data in an organization's database, which will be provided
either by the data dictionary or by the database management
system itself. First, procedures should exist to ensure
that only authorized personnel are allowed to access the
information contained within the database. The widespread
use of computers and the increasing sophistication of users
have made an organization's data vulnerable to embezzlers,
amateur "hackers", corporate spies, and careless employees.
Second, the system should contain provisions for controlling
the amount and types of data that each authorized user is
allowed to access within the system. Some of the
sophisticated data dictionaries, for example, include a
trace mechanism which increases security by recording every
inquiry that is made into system files and data. If an
intrusion is made into the system by unauthorized personnel,
the specifics of that inquiry, including the data which was
accessed, will be recorded.
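
A trace mechanism of the kind just described can be sketched
in a few lines of Python. This is an illustration under our
own assumptions, not the design of any actual dictionary:
the TraceLog class, its method names, and the user and item
names are all hypothetical.

```python
from datetime import datetime, timezone


class TraceLog:
    """Hypothetical trace mechanism: every inquiry is recorded."""

    def __init__(self, authorized_users):
        self.authorized_users = set(authorized_users)
        self.entries = []

    def record(self, user, item):
        """Log an inquiry and report whether the user was authorized."""
        authorized = user in self.authorized_users
        self.entries.append({
            "when": datetime.now(timezone.utc),
            "user": user,
            "item": item,
            "authorized": authorized,
        })
        return authorized

    def intrusions(self):
        """Return the logged inquiries made by unauthorized users."""
        return [e for e in self.entries if not e["authorized"]]


trace = TraceLog(authorized_users=["supply_officer"])
trace.record("supply_officer", "PART.Quantity")
trace.record("unknown_user", "PART.Storage_Location")
```

Note that the inquiry is logged whether or not it is
authorized, so that the specifics of an intrusion survive
for later review.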

Metadata should be afforded at least as much
protection as the data in the database. Leong-Hong and
Plagman [Ref. 13] present an example of the importance of
the security of metadata as it concerns

the data resources in intelligence/military applications
such as the classification code of intelligence
documents. When security profiles for the metadata entities
are stored in the metadatabase, unauthorized access to
the metadata could be most damaging. This is because
presumably one would be able to 'crash into' the system
using that information.

It is the task of the dictionary administrator to analyze
the metadata to determine the levels of security required
and to grant access privileges (read and write, read only,
update) to users for certain portions of the metadata.
Information about users, their passwords, and their
privileges is stored in the data dictionary and is
accessible only to personnel authorized by the
administrator.
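
The granting of privileges over portions of the metadata
could be modeled as shown below. This Python sketch rests on
several assumptions of ours: the source names the three
privilege levels but does not define which operations each
one permits, so the ALLOWED table is one plausible
interpretation, and the names MetadataAccess, grant, and
permits are hypothetical.

```python
from enum import Enum


class Privilege(Enum):
    READ_ONLY = "read only"
    UPDATE = "update"
    READ_WRITE = "read and write"


# Operations each privilege level permits (an assumed interpretation).
ALLOWED = {
    Privilege.READ_ONLY: {"read"},
    Privilege.UPDATE: {"read", "update"},
    Privilege.READ_WRITE: {"read", "update", "create", "delete"},
}


class MetadataAccess:
    """Hypothetical table of grants kept by the dictionary administrator."""

    def __init__(self):
        self._grants = {}   # (user, portion) -> Privilege

    def grant(self, user, portion, privilege):
        self._grants[(user, portion)] = privilege

    def permits(self, user, portion, operation):
        """True only if a grant exists and covers the operation."""
        privilege = self._grants.get((user, portion))
        return privilege is not None and operation in ALLOWED[privilege]


access = MetadataAccess()
access.grant("analyst", "personnel", Privilege.READ_ONLY)
access.grant("dba", "personnel", Privilege.READ_WRITE)
```

A user with no grant for a portion of the metadata is simply
refused, which makes denial the default.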

We have already shown in Figure 2.3 that subschemas 
contribute to security by limiting the size of the "window" 
through which a database user looks at data. When a user 
attempts to access a particular subschema, the request is 
routed through the data dictionary to determine whether 
access is authorized and, if so, the structure of the 
subschema. Only at this point is the "real" data in the
database accessed.
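
The routing of a request through the dictionary before any
data is touched can be illustrated as follows. This is a
minimal Python sketch under stated assumptions: the
in-memory DATABASE and SUBSCHEMAS tables, the fetch
function, and the user name army_clerk are hypothetical, and
the sample records are invented for illustration.

```python
# Stand-in for the "real" database: one invented record per student.
DATABASE = [
    {"SSN": "123456789", "NAME": "SMITH", "SERVICE": "ARMY", "RANK": "CPT"},
    {"SSN": "987654321", "NAME": "JONES", "SERVICE": "NAVY", "RANK": "LT"},
]

# Dictionary metadata: each subschema lists its fields and permitted users.
SUBSCHEMAS = {
    "STU-ARMY": {"fields": ["SSN", "NAME", "RANK"], "users": {"army_clerk"}},
}


def fetch(user, subschema_name):
    """Consult the dictionary first; touch the data only if authorized."""
    subschema = SUBSCHEMAS.get(subschema_name)
    if subschema is None or user not in subschema["users"]:
        raise PermissionError(f"{user} may not use {subschema_name}")
    fields = subschema["fields"]
    # Only now is the data read, and only through the subschema "window".
    return [{f: row[f] for f in fields} for row in DATABASE]
```

The sketch only narrows the set of fields a user sees;
whether a subschema should also restrict which records are
returned is a design decision the source does not address.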

2. Data Integrity

The keys to data integrity are the control of inputs
to the database and the minimization of data duplication.
Properly used, these keys will enhance communication between
users by ensuring that a single, correct source of data is
maintained.

Because the data in a database is shared among many 
users, it is essential to have some means of enforcing stan- 
dards for entering data, updating it, and maintaining it. 
For example, the data dictionary identifies constraints, or
limitations on the values data can have. Fields can be
defined as mandatory or optional, as alphanumeric or
numeric, and as having a minimum or maximum length. The data
dictionary contains comments on how data should be used in
order to assist those using the data dictionary. Another
important control feature of a data dictionary is how it
deals with synonyms, that is, an entity or attribute with
more than one name. For instance, the entities EMPLOYEE,
REGIONAL_MANAGER, and EXECUTIVE may all be used by different
departments in the organization to refer to Linda Smith.
The administrator must standardize the terminology used in
the organization and eliminate as many synonyms as possible.
When this is not feasible, all of these synonyms, or
aliases, must be recorded in the data dictionary. Van Duyn
[Ref. 14] explains that



It is not unusual to have similar types of data elements 
in the database and in various applications. In such 
cases, and in cases where the same data type is known by 
other names, the DDS [data dictionary] can be used to
inform the users of the relationships that exist among
these data and of the disposition of their usage. In
other words, the DDS provides information as to which 
modules/programs and systems use the same data type and 
how they relate. 
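
Recording synonyms so that every alias leads back to one
standard name can be sketched as below. The Python class
AliasRegistry and its methods are hypothetical, and the
alias STAFF_MEMBER is invented for the example; only
EMPLOYEE and EXECUTIVE come from the discussion above.

```python
class AliasRegistry:
    """Hypothetical synonym table: many aliases map to one standard name."""

    def __init__(self):
        self._standard_for = {}

    def register(self, standard_name, aliases=()):
        """Record a standard name and any aliases that refer to it."""
        self._standard_for[standard_name] = standard_name
        for alias in aliases:
            existing = self._standard_for.get(alias)
            if existing not in (None, standard_name):
                raise ValueError(f"{alias} already refers to {existing}")
            self._standard_for[alias] = standard_name

    def resolve(self, name):
        """Return the standard name, or None if the name is unknown."""
        return self._standard_for.get(name)

    def aliases_of(self, standard_name):
        return sorted(alias for alias, std in self._standard_for.items()
                      if std == standard_name and alias != standard_name)


registry = AliasRegistry()
registry.register("EMPLOYEE", aliases=["EXECUTIVE", "STAFF_MEMBER"])
```

Refusing to let one alias point at two standard names is
what keeps the table usable as a single source of truth.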



The data dictionary also contributes to data
integrity because it reduces the necessity for duplication
of data and therefore lessens the opportunities for error.
The information about the components of different subschemas
of the same logical view is stored in the data dictionary in
place of the data itself. A user, whether writing a program
or creating a new entity-type, should be able to query the
data dictionary to ensure that the necessary routines or
entities do not already exist within the system. Perhaps



one of the most important benefits of DDS [data diction- 
aries] is that because it gives accurate, and timely 
information, management can control more efficiently not 
only the automated and manual data of the enterprise but 
all its resources and operations. Consequently, manage- 
ment is provided with precise and accurate data for 
quick, profitable decision-making [Ref. 15].

Thus, the possibility of two users querying the database and
receiving different answers to the same question at the same
time is decreased.

3. Documentation/Maintenance

Because maintenance is the most expensive and
time-consuming phase of software development, documentation
and maintenance of the organization's data is probably the
most significant objective of the data dictionary. It is a
fact of software life that documentation is often avoided
during system development and program design. To a large
extent, this is because documentation can be prepared as an
"afterthought"; it is not essential to the operation of the
system. But when a system is developed that includes a data
dictionary from the beginning, the data which is required by
the data dictionary forces documentation to become an
integral part of the design. "The use of a dictionary
provides documentation of a quality and form that is simply
not available through less formalized procedures in the data
processing environment" [Ref. 16].

The data dictionary can also reduce the amount of
effort required by maintenance personnel because it provides
"a 'roadmap' for the programmer doing maintenance. It
records the programs being maintained, their data structures
and their relationships" [Ref. 17]. We have defined an
active data dictionary as one in which information is
created, accessed, or modified through the data dictionary
interfaces, with new or changed metadata automatically
stored in the data dictionary. This "continuous maintenance"
can be used to allow the database administrator to monitor
where data is used, who uses it, how often it is used, and
what changes have been made to it. Because the data
dictionary provides a wealth of documentation, it is
possible to trace an "audit trail" through the
organization's data, from user names and department to the
kind of data used in a program to how many records a certain
field appears in. Also,



The tracking of how programs/modules use particular data
as well as which files/segments contain certain data is
extremely important to the systems analyst in performing
system changes. Through the DDS [data dictionary], he
or she is able to ascertain what impact the proposed
changes will have on other components of the system and
upon functional areas within the enterprise. By having
an accurate, up-to-date assessment of the location and
usage of data that will be involved in the system
change, the analyst can accomplish the task more
efficiently [Ref. 18].
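
The kind of tracking described in this passage amounts to a
cross-reference from each data element to the programs that
use it and the files that contain it. The Python sketch
below is our own illustration; the class WhereUsedIndex, its
methods, and the program and file names are hypothetical.

```python
from collections import defaultdict


class WhereUsedIndex:
    """Hypothetical cross-reference of data elements to programs and files."""

    def __init__(self):
        self.programs = defaultdict(set)   # element -> programs using it
        self.files = defaultdict(set)      # element -> files containing it

    def record_program_use(self, program, element):
        self.programs[element].add(program)

    def record_file_content(self, file_name, element):
        self.files[element].add(file_name)

    def impact_of_change(self, element):
        """List everything that must be reviewed if the element changes."""
        return {
            "programs": sorted(self.programs.get(element, ())),
            "files": sorted(self.files.get(element, ())),
        }


index = WhereUsedIndex()
index.record_program_use("PAYROLL", "RANK")
index.record_program_use("ROSTER", "RANK")
index.record_program_use("ROSTER", "NAME")
index.record_file_content("PERSONNEL", "RANK")
```

Asking for the impact of a change to RANK would then return
both programs and the one file, which is the list an analyst
needs before altering the field.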

C. THE IDEAL DATA DICTIONARY

Having identified the functions of a data dictionary in
Chapter II and how they support the accomplishment of the
objectives just discussed, we can now use these concepts to
evaluate data dictionaries. The "ideal" data dictionary
would be one that possesses all the capabilities necessary
to support all potential users in all possible applications.
However, this ideal dictionary would be impossible to
conceptualize, much less to create. The ideal data
dictionary for an organization will depend on the
organization's size, functions, and needs. The potential
users of a dictionary will have to develop a set of criteria
upon which a candidate will be judged.

Many references provide criteria for evaluating data
dictionaries and identify those characteristics which are
vital to the management of the data resource.
Unfortunately, it is difficult to find two references that
propose the same criteria. One excellent source, Leong-Hong
and Plagman [Ref. 19], lists nine categories for evaluation:

1. data description facility 

2. data documentation support 

3. metadata generation 

4. security support 

5. integrity support 

6. user interface 

7. ease of use 

8. resource utilization 

9. vendor support 

It is important to recognize a distinction between two
categories of criteria for the ideal data dictionary: those
that evaluate the vendor and operating environment, and
those that evaluate the data dictionary itself. In the
former category, items like vendor support and reliability,
the choice between free-standing or DBMS-dependent data
dictionaries, the degree of integration with other system
components, and the quality of system documentation are
important considerations that may drive the decision between
two comparable data dictionaries. It is, however, the
latter type of criterion that will be vital in
identification of the essential requirements of the ideal
data dictionary. We have grouped all such requirements into
six categories: system standard schema and extensibility,
command and query languages, ease of use (including menus),
security, documentation and reports, and application
interfaces. (We have assumed that the objective of data
integrity will be accomplished by the correct, and enforced,
use of any data dictionary.) If a particular dictionary
fully supports each of these six criteria then it will most
likely meet all of the organization's data management needs.

1. System Standard Schema and Extensibility

The ideal data dictionary must provide a system
standard schema with all the descriptors necessary to
support the range of applications required by the
organization while still being simple enough to be
competitively priced. It must provide "enough" descriptors
to be fully capable without providing so many that the
schema becomes confusing. Additionally, the ideal dictionary
must support the user (or data dictionary administrator) in
modifying existing schema descriptors and creating new
entities, relationships, and attributes. This extensibility
is vital in supporting applications specific to the
organization's needs.

2. Command and Query Languages

The ideal dictionary must provide both command and 
query languages. The command language must support creation 
and modification of data structures and subsequent entry of 
data into those structures. The command language must 
include edit commands to facilitate addition, modification, 
and deletion of system data. It should include commands
restricted to use by the data dictionary administrator, 
e.g., password assignment. The ideal system will include a 
query language to support the analysis and production of 
usable information from the organization’s data. Perhaps 
one of the most important features of a data dictionary (and 
database), query languages allow data to be screened in 
order to provide concise and specific information to support 
timely management decisions. 






3. Ease of Use 



Ease of use, or user-friendliness, is another
important aspect of the ideal data dictionary. It must be
supportive of new users while still providing full
functional support of the system "experts". Two primary
ingredients of user-friendliness are the availability of
menus and carefully conceived examples in the dictionary's
reference manuals. A hierarchy of menus can reduce complex
operations to a series of smaller, friendlier steps while
user documentation provides easy-to-understand examples that
guide the inexperienced user through each phase of system
operation. As microcomputers and the concept of the
automated office continue to spread, ease of use will become
an even more important consideration in deciding which
software products to utilize.

4. Security 

Security will be a vital concern of the ideal data 
dictionary. Protection and control of system information 
must be provided. The data dictionary administrator must be 
provided the capability to control personnel access to 
system data. He or she must also be able to grant different 
degrees of access to different users. Similarly, users 
should have the capabilities to protect, and grant access 
to, those structures and data which they control. 

5. Documentation and Reports

The documentation and reports created by the ideal 
data dictionary must also be clear and understandable. 
Timely and accurate preparation of reports is a key
objective of any DBMS. The data dictionary is uniquely
qualified to assist with this function. By ensuring the
integrity of data accessed and supporting query commands,
the ideal data dictionary can provide reports and
documentation to answer specific questions as they arise.

6. Application Interfaces

The final important characteristic of the ideal data 
dictionary is its ability to interface with the other appli- 
cations that may exist in the organization. If the data 
dictionary is free-standing, it should interface with many 
of the currently available database management systems. If
DBMS-dependent, the dictionary should interface with all
components of that system. Additionally, the ideal data
dictionary should interface with code generators, communica- 
tion systems, and other agents of the users' environment. 

In the following chapter, we will study and evaluate
four of the popular data dictionaries that are currently
available. We will use the characteristics of the ideal
data dictionary that we have defined to compare and contrast
the features of the four dictionaries. In addition, each
will be compared to the "standard" dictionary presented in
the FIPS DDS.






V. EVALUATION OF COMMERCIAL DATA DICTIONARIES 



The purpose of this chapter is to review and evaluate a
cross-section of commercial data dictionary packages. We
selected four dictionaries: DATA DESIGNER, DATAMANAGER,
ORACLE, and DATADICTIONARY. User documentation and library
sources were the primary sources of information for our
evaluation. Additionally, ORACLE was available on the Naval
Postgraduate School's VAX minicomputer, and we observed
demonstrations of DATA DESIGNER and DATADICTIONARY.

A. DATA DESIGNER 

DATA DESIGNER is a free-standing data dictionary
developed by Database Design, Inc. It was introduced in 1975
with the goal of supporting logical database design by
solving some of the traditional problems associated with
multiple-application database management systems, such as
duplication of data, excessive storage requirements, data
consistency, complexity, and modifiability. DATA DESIGNER
can be used in conjunction with a variety of database
management systems, including IMS, IDMS, ADABAS, NOMAD, and
others. Additionally, it can produce designs that will
interface with COBOL and other non-DBMS tools or systems.

DATA DESIGNER can be characterized as an

automated, easy-to-use tool that assists the database
designer in formulating normalized views of the data
requirements and synthesizes these views into a
canonical normalized form. . . . DATA DESIGNER maintains
information needed to physically structure the database
for efficient performance [Ref. 20].

In addition to providing the standard functions of a data
dictionary, DATA DESIGNER goes several steps beyond. It
provides an extensive set of commands categorized as user
commands, edit commands, and plotting commands, as shown in
Table 1. It also supports limited production of models and
graphics. Furthermore, DATA DESIGNER's capabilities include
powerful generation options and report features that will
support the design and maintenance of applications.



                         TABLE 1

            Standard Commands of DATA DESIGNER

User Commands

    ADD             BATCH           BUILD
    COPY            CREATE          EMPTY
    END             FILES           GENERATE
    HELP            HIERARCHY       PLOT
    PRINT           RENAME          REPORT
    SHOW OPTIONS    TRANSFER        VALIDATE

Edit Commands

    DELETE          EDIT            INSERT
    LIST            RENUMBER        REPLACE

Plotting Commands

    DRAW            DONE            RETURN
    SET ALT         SET DEVICE      SET RANGE
    SET TITLE       SET TYPE        SHOW



DATA DESIGNER supports logical database design through a
five-step process:

1. A data dictionary file is created that contains a 
list of all standard data item names to be used. 

2. Subschema files are created that describe all of the
views necessary to support user data requirements.

3. The encoded user views are validated. This step
verifies the syntax of each view and ensures that
each data item name listed in a view is in the data
dictionary.

4. All of the verified user views are synthesized into 
a logical data model. Reports and diagrams are 
generated to reflect this model. 

5. The model is evaluated to ensure that it meets all
user requirements and is modified as necessary by
repeating steps (1) through (4).

DATA DESIGNER utilizes three kinds of files: dictionary
files, subschema files, and generated design files. A
dictionary file ($DIC) contains a list of all data elements
that will be used in an application or subschema. This list
serves as a base for further development, e.g., additional
views. A subschema file ($SUB) contains data items and
relationships pertaining to particular views. Finally, the
generated design file ($DES) contains a logical data model
generated by DATA DESIGNER using the applicable dictionary
and subschema files as input. The generated design files,
in turn, serve as the input for the report and graphics
functions.

Key commands utilized during the creation of a logical
database design include the following:

PLOT: uses the plotting subsystem to draw the
logical design.

EDIT: supports modification of existing files when
necessary.

In order to acquaint the reader with the operation of
DATA DESIGNER, we will demonstrate the dialog associated
with each step of the process necessary to create our Naval
Postgraduate School database example of Chapter II. The
user of DATA DESIGNER must first create the dictionary file
STUDENT.DIC and the subschema file STUDENT.SUB (user inputs
are indicated by boldface type) as follows:

>CREATE STUDENT.DIC DICTIONARY
DDFC0101I File "STUDENT.DIC" of type "$DIC" created.

>CREATE STUDENT.SUB SUBSCHEMA
DDFC0203I File "STUDENT.SUB" of type "$SUB" created.

Next, the BUILD command is used to load data items into the
two created files. First, all possible data items are listed
in the dictionary file:

>BUILD STUDENT.DIC
DDBS0065I The file type is $DIC.
DDBS0018I There are no records in the file.
B>NAME
B>SSN
B>SERVICE
B>RANK
B>DONE
DDBS0064I File building is done.
DDBS0068I 4 records were entered
DDRH0098I Line 1100 is now the last line in your file.

The subschema file will support creation of one or more user 
views. In our example, the suoschema file contains two 
views, the basic, overall view and the view intended for 
Army use only. Notice that after the user enters the BUILD 
process, each line must start with a modeling code. These 
codes are used to identify components and to establish rela- 
tionships within the views. When building the subschema 
files, all desired relationships must be specifically 
stated. DATA DESIGNER uses "1" to specify a one-to-one 
relationship and "M" for a one-to-many relationship. A 
complete list of the modeling codes used in this example 
appears in Table 2. 
>BUILD STUDENT.SUB 
DDBS0075I The file type is $SUB. 
DDBS0081I There are no records in the file. 
B>V,STU-1 
****************************************** 
* THIS VIEW SUPPORTS THE OVERALL VIEW    * 
****************************************** 
B>F,0100 
B>T,0003 
B>K,SSN 
B>1,NAME 
B>1,SERVICE 
B>1,RANK 
B>V,STU-ARMY 
****************************************** 
* THIS VIEW SUPPORTS THE ARMY VERSION    * 
****************************************** 
B>F,0125 
B>T,0002 
B>K,SSN 
B>1,NAME 
B>1,RANK 
****************************************** 
B>DONE 
DDBS0064I File building is done. 
DDBS0068I 13 records were entered 



TABLE 2 

DATA DESIGNER Modeling Codes 

Code    Modeling Use 

V       Name a user view 
F       Specify frequency of use 
T       Specify req'd response time 
K       Name a key 
C       Concatenate keys and data 
S       Concatenate keys in short way 
L       Label a data group 
M       Identify a multiple association 
1       Identify a single association 
N       Name an association 
*       Insert comments 



Once the dictionary and subschema files ar 
the VALIDATE command is used to ensure that all 
relationships in the subschema files are valid 
information previously specified in the dicti 
DATA DESIGNER will respond with the number of views 
processed, the number of lines read, and the number of 
validation errors, if any, that were located: 



>VALIDATE STUDENT.SUB STUDENT.DIC 
DDVS0013I Validation begins. 
DDVS0024I 2 Views were processed. 
DDVS0025I 13 lines were read. 
DDVS0015I 0 validation errors were detected. 
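
The kind of check that VALIDATE reports can be approximated in a few lines. The sketch below is ours and makes several assumptions: the function name `validate`, the record format (a modeling code, a comma, and a value), the rule that every K, 1, and M record must name an item in the dictionary file, and the decision to skip comment records when counting lines. None of these is taken from DATA DESIGNER itself.

```python
# Hypothetical sketch of a VALIDATE-style check; not DATA DESIGNER code.
ITEM_CODES = {"K", "1", "M"}  # codes assumed to reference dictionary items


def validate(subschema_lines, dictionary_items):
    """Return (views, lines_read, errors) for a list of 'code,value' records."""
    views = 0
    lines_read = 0
    errors = []
    for raw in subschema_lines:
        line = raw.strip()
        if not line or line.startswith("*"):
            continue  # blanks and comment records are not counted (assumed)
        lines_read += 1
        code, _, value = line.partition(",")
        code, value = code.strip().upper(), value.strip()
        if code == "V":
            views += 1
        elif code in ITEM_CODES and value not in dictionary_items:
            errors.append(f"{value!r} is not in the dictionary")
    return views, lines_read, errors


DICTIONARY = {"NAME", "SSN", "SERVICE", "RANK"}
SUBSCHEMA = [
    "V,STU-1", "F,0100", "T,0003", "K,SSN", "1,NAME", "1,SERVICE", "1,RANK",
    "V,STU-ARMY", "F,0125", "T,0002", "K,SSN", "1,NAME", "1,RANK",
]

views, lines_read, errors = validate(SUBSCHEMA, DICTIONARY)
print(views, "views,", lines_read, "lines,", len(errors), "errors")
```

Run against the STUDENT records entered above, the sketch counts two views, thirteen lines, and no errors, which agrees with the dialog.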



Once the files are successfully ? 
utilize the subschemas to generate 
for his or her application. 

The ten GENERATE options from 
>GENERATE OPTION 4 5 6 9 10 TO STUDENT.DESIGN 
DDGS0032I Design generation begins. 
DDGS0058I The subschema file is STUDENT.SUB 
DDGS0214I Option 4 ignores undefined links. 
DDGS0281I Option 5 generates foreign key information. 
DDGS0301I Option 6 generates candidate key information. 
DDGS0307I Option 9 generates cross-reference info. 
DDGS0307I Option 10 ignores frequency and timing info. 
DDGS0054I Design generation has finished. 






TABLE 3 

DATA DESIGNER Generate Options 

Option  Purpose 

1       Generate unspecified associations. 
2       Suppress resolving redundant data. 
3       Suppress creating intersection files. 
4       Suppress generating inverse links. 
5       Generate foreign key information. 
6       Generate candidate key information. 
7       Allow repeating data items in groups. 
8       Suppress generating single key groups. 
9       Generate cross-reference information. 
10      Suppress frequency/timing information. 



At this point, the logical database design is completed. 
When using the options specified in the example, a series of 
reports will be automatically generated. A list of reports 



TABLE 4 

Reports Available with DATA DESIGNER 

Report  Type 

1       Data Group Links Report 
2       Canonical Schema Report 
3       Data Group Index Report 
4       Multiple Occurrences of Data Items 
5       Data Relation Report 
6       Data Group Candidate Keys Report 
7       Data Item to User View Cross-Reference 
8       User View to Data Group Cross-Reference 
9       Data Group to User View Cross-Reference 



created is contained in Table 4. To print these reports, 
the user's dialog will simply be 
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>REPORT 123456789 PRINTER FRD 

DDP20073I The reports were printed. 

As a final aid in evaluation of the 
design, DATA DESIGNER is capable of prod 
(1) an overview of the logical database de 
hierarchical representation of that log 
produce the logical overview diagram, th 
is required: 

>PLOT 

DDPT0289I DATA DESIGNER Print Plo 

P>SET TYPE OVERVIEW 
P>SET TITLE LOGICAL-DESIGN 
P>DRAW FROM STUDENT.DESIGN 

DDFS0310I Design STUDENT.DESIGN's desc 

DDNX0271I The overview plot generation 
P>RETURN 
P>END 

After using the printed reports and diagra 
database design, the user will, if satisfi 
design into a specific DBMS format, such 
DATA DESIGNER's EDIT capabilities to rev 
necessary. 

As discussed in Chapter IV, data di 
evaluated on the basis of their accomplis 
integrity, and documentation/maintenance. 
a free-standing data dictionary that can b 
tion with a variety of DBMS and non-DBMS 
address the security aspect. It was a 
with the assumption that the parent syste 
DESIGNER interacts will handle access 
security-related functions. 

DATA DESIGNER does, however, recei 
maintaining data integrity and for the gu 
mentation. Because it is designed to su 
ment of logical database designs, it utili 
files to ensure that duplication of d 
through generation of cross-reference 
subschema is modified, DATA DESIGNER a 
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dictionary files in the subsequent design generations. The 
PLOT and REPORT functions provide a wealth of information 
about the design, its components, and all users of the 
subschemas. Relationships, both those included by the user 
and those produced by DATA DESIGNER, can be seen in written 
reports and visual representations. When modifications and 
new designs are produced, the reports are automatically 
updated to reflect all changes. 

B. MSP DATAMANAGER 

DATAMANAGER, developed by MSP, Inc. of Lexington, 
Massachusetts, is one member of the MANAGER family of 
dictionary-oriented software products. Other products 
include DESIGNMANAGER, PROJECTMANAGER, SOURCEMANAGER, and 
TESTMANAGER. The entire line of products, while capable of 
batch operations, is designed specifically to support 
interactive operations with IBM 360/370/30xx/4300 series (and 
plug compatible) computers. While DATAMANAGER is designed 
as a nucleus for further expansion or specialization, it 
provides all basic capabilities necessary to create and 
maintain user dictionaries. Additional capabilities, available 
as a series of extra-cost, add-on modules, include: 

1. interfaces to IDMS, ADABAS, IMS, TOTAL, SYSTEM 2000, 
and other DBMS 

2. teleprocessing interfaces 

3. generation of COBOL, PL/I, or other source language 
data descriptions 

4. generation of DATAMANAGER data definitions from 
existing COBOL or PL/I source code 

5. interfacing of a DATAMANAGER dictionary to user- 
written programs 

6. status, audit, and security facilities 

7. extensibility through a user-defined syntax facility 
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DATAMANAGER can provide data dictionary capabilities to 
users utilizing a variety of hardware/software combinations. 
By providing interface modules for several popular database 
management systems, DATAMANAGER is obviously more flexible 
than one that is tied to a single, distinct database system. 
However, DATAMANAGER's flexibility extends beyond the 
obvious: 
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The architecture, or structure, of the DATAMANAGER data 
dictionary is composed of four (or five) data files, called 
data sets in the user documentation. 

The source data set contains the data definitions as 
originally input into the system by the user. When the user 
modifies or appends changes, the data definitions are auto- 
matically updated within the file. 

The data entries data set contains all encoded data 
definitions generated by DATAMANAGER after evaluating the 
contents of the source data set. Data definitions are 
encoded to reduce the time required for DATAMANAGER to 
process the information within the data dictionary. During 
this encoding process, relationships, aliases, and classifi- 
cations are also identified. 

The index data set is an automated index containing the 
name and address of each entity definition that is in the 
source data or data entries data sets. The index data set 
serves as a data directory to support the fastest possible 
retrieval of entity definitions and associated data. 

The error recovery data set is used by the system as a 
temporary backup storage file. This capability was imple- 
mented to increase reliability by providing for automatic 
recovery of the dictionary contents in the case of external 
interruption or other system failure during a dictionary 
update. 

The log data set is an optional capability that is 
highly recommended by MSP. All updating commands, associ- 
ated data definitions, and amendments are logged into that 
file as they occur. Entries include command identification, 
full date, time, user, and status of all physical input/ 
output accesses. Additionally, the data administrator has 
the option of specifying that all commands directed to the 
data dictionary be logged. When combined with other system 
backup facilities, this allows DATAMANAGER to be "rolled" 
forward from the last backup point in case full recovery is 
ever required. 
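
The roll-forward idea can be illustrated with a short sketch. It is only an approximation under our own assumptions: the log record layout (`time`, `command`, `member`, `definition`, `status`) and the function name `roll_forward` are hypothetical, and the meanings given to ADD, REPLACE, and REMOVE are simplified rather than taken from the MSP documentation.

```python
# Hypothetical roll-forward sketch; the log record layout is assumed.
import copy


def roll_forward(backup, log, since):
    """Reapply logged updates made after the backup point, in time order."""
    dictionary = copy.deepcopy(backup)  # never modify the backup itself
    for entry in sorted(log, key=lambda e: e["time"]):
        if entry["time"] <= since or entry["status"] != "OK":
            continue  # already in the backup, or the update never completed
        if entry["command"] in ("ADD", "REPLACE"):
            dictionary[entry["member"]] = entry["definition"]
        elif entry["command"] == "REMOVE":
            dictionary.pop(entry["member"], None)
    return dictionary


backup = {"STUDENT-NAME": "CHAR 50"}  # state at the last backup (time 1)
log = [
    {"time": 1, "command": "ADD", "member": "STUDENT-NAME",
     "definition": "CHAR 50", "status": "OK"},
    {"time": 2, "command": "ADD", "member": "STUDENT-SSN",
     "definition": "CHAR 11", "status": "OK"},
    {"time": 3, "command": "REPLACE", "member": "STUDENT-NAME",
     "definition": "CHAR 60", "status": "FAILED"},
    {"time": 4, "command": "REMOVE", "member": "STUDENT-NAME",
     "definition": None, "status": "OK"},
]

restored = roll_forward(backup, log, since=1)
print(restored)
```

The sketch shows why the log must record both the status and the time of each update: interrupted updates are skipped, and only entries later than the backup point are replayed.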



DATAMANAGER is a powerful system that utilizes a series 
of interactive commands to create, maintain, and document 
data dictionary contents. These standard commands are 
listed in Table 5. DATAMANAGER provides a predefined series 
of standard entity-types, relationship-types, and attribute- 
types that form the system standard schema. These are 
listed in Table 6. As shown in Table 6, DATAMANAGER uses 
only six entity-types in the standard schema. Those 
elements exist within the system as members of a logical 
hierarchy as shown in Figure 5.1. Discussion in the user 
documentation reveals that DATAMANAGER strives to provide 
the capability to maintain all system data while maintaining 
ease and simplicity of logical design. 







TABLE 5 

DATAMANAGER Standard Commands 

ADD         ALSO      ALTER 
AUTHORITY   BULK      COPY 
DICTIONARY  DOES      DROP 
ENCODE      ENDDMR    FORMAT 
GLOSSARY    INSERT    KEEP 
LIST        MODIFY    PERFORM 
PRINT       PROTECT   REMOVE 
RENAME      REPLACE   REPORT 
SHOW        STATUS    WHAT 
WHICH       WHO       WHOSE 



TABLE 6 

DATAMANAGER Standard Schema Descriptors 

PROCESS ENTITY-TYPES 
MODULE            PROGRAM              SYSTEM 

DATA ENTITY-TYPES 
FILE              GROUP                ITEM 

RELATIONSHIP-TYPES 
SEE 

ATTRIBUTE-TYPES 
ACCESS-AUTHORITY  ADMINISTRATIVE-DATA 
ALIAS             CATALOGUE 
COMMENT           DESCRIPTION 
EFFECTIVE-DATE    FREQUENCY 
NOTE              OBSOLETE-DATE 
QUERY             SECURITY-CLASS 
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types, within which it is possible to describe all 
elements and assemblages of data and the processess t 
act on the data. The number of member types defined 
the basic hierarchy has been kept as small as possible 
while meeting these requirements. [Ref. 22] 




Figure 5.1 DATAMANAGER's Hierarchy of Entity-types 
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At the lowest level, an ITEM is a fundamental element of 
data, the smallest unit within DATAMANAGER. A GROUP is a 
collection of items or other groups. The third entity-type, 
the FILE, can either be implemented as a traditional file 
organization (a collection of data groups, independent of a 
DBMS), or as the equivalent association of data within a 
database. If DATAMANAGER is used with a database, another 
entity-type, DATABASE, will be provided with the database 
interface module, e.g., ADABAS, that is selected. The new 
member, in this case ADABAS-DATABASE, will either replace 
the FILE element within the hierarchy, or coexist by 
residing between the FILE and MODULE elements. A MODULE is 
a collection of data that includes descriptions of a data- 
base (if used), FILEs, GROUPs, and/or ITEMs. The module is 
the lowest unit that can directly or indirectly manipulate 
data, and is a subdivision of a PROGRAM. The PROGRAM is 
defined in terms of collections of modules and those 
processes that input or output data to/from the system. A 
program is executable. A SYSTEM is the highest element of 
the DATAMANAGER hierarchy and contains all subordinate data 
declarations. 
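
The containment rules in this hierarchy can be written down as a small lookup table. The sketch below reflects our reading of the description and is not part of DATAMANAGER; the names `CONTAINS`, `may_contain`, and `reachable` are hypothetical, and the optional DATABASE entity-type is omitted for brevity.

```python
# Hypothetical containment rules inferred from the prose above;
# the optional DATABASE entity-type is left out for brevity.
CONTAINS = {
    "SYSTEM": {"PROGRAM"},
    "PROGRAM": {"MODULE"},
    "MODULE": {"FILE", "GROUP", "ITEM"},
    "FILE": {"GROUP"},
    "GROUP": {"GROUP", "ITEM"},  # a group may hold items or other groups
    "ITEM": set(),               # an item is the smallest unit
}


def may_contain(parent: str, child: str) -> bool:
    """True when `child` may appear directly inside `parent`."""
    return child in CONTAINS.get(parent, set())


def reachable(root: str) -> set:
    """Every entity-type that can appear anywhere beneath `root`."""
    seen, stack = set(), [root]
    while stack:
        for child in CONTAINS[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


print(may_contain("GROUP", "ITEM"))
print(sorted(reachable("SYSTEM")))
```

A table of this kind makes it easy to reject an illegal declaration, such as an ITEM that claims to contain a GROUP, before it is stored.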

While DATAMANAGER stresses simplicity in the logical 
design of the system standard schema, it can be configured 
to be highly extensible. An add-on module, the User Defined 
Syntax Facility (UDSF), is required to support user declara- 
tion of schema descriptors. If present, this facility 
provides several unique capabilities. First, in addition to 
allowing the user to define his or her own entity-types, the 
module allows the data administrator to insert one (or more) 
of three standard sets of extended entity-types. These sets 
are: 

1. The Extended Data Processing Structure (EDPS) which 
provides additional entity-types frequently used 
within the data processing environment. These 
include PROCEDURE, SUBROUTINE, and DATASET. 
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2. The Structu red Analysis Structure 

provides entity-types frequently 
conducting structured design. Th 

SUBPROCESS and DATASTRUCTURE. 

3. The Structured Development Structure 
strives to provide all entity-types 
satisfy the requirements of the majori 
tial users. This collection of entity- 
all those found in the EDPS and SAS sub 

Second, the UDSF module supports user d 
attribute-types related to both system stands 
created entity-types. Three distinct c 

attribute-types are recognized within DATAKAM 
are: 

1. Global (common) attribute-types which 
all entity-types within the struc 
SECURITY-CODE. 

2. Generic attribute-types which can be a 
of a specific standard entity-type, 
FILE. Whenever a user defined en 
created that uses the standard entity- 
as a base, the generic attribute-types 
dard entity-type will be passed into th 
type. 

3. Specific attribute-types which allow th 
tailor an entity-type to satisfy th 
requirements of that organization. 

Finally, the UDSF module supports user d 
relationship-types in both forward and backwar 
This enables DATAMANAGER to support the thr 
relationship mappings we have previously descr 

Once DATAMANAGER is installed on the comput 
steps must be conducted before information can 
the data dictionary. First, an empty data di 
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be defined using Controller commands (restricts 
the data administrator), DICTIONARY, and 

Briefly, the dictionary must be created 
authority levels must be defined, and potentia 
be identified. As the second major implemen 
member entity-types, both standard and user-crea 
be defined. Every session with DATAMANAGER is 
a "run", in which a series of system commands, 
the user, are carried out. Every session must i 
the commands DICTIONARY and AUTHORITY. After c 
user documentation, this process will probably 
cult and confusing to most users, even to tho 
worked with other data dictionaries. DATA 

however, an impressive, powerful package in the 
experienced user. Our sample database, STUDEN 
entered as a FILE (or DATABASE, if implemented). 
for an individual student's record becomes a GROUP 
each data element, e.g., service, SSN, etc., 
ITEM. The structure of our example, after imple 
DATAMANAGER, would appear as shown in Figure 5.2 
DATAMANAGER aggressively supports each of 
objectives of data dictionary usage: data 

security, and maintenance/documentation. It e 

integrity through its hierarchical structure 
types, predefined standard schema relationships, 
tion of aliases, and automatic update procedur 
definitions and error-checking are used to v 
structural "correctness" of each entity, relati 
attribute as it is created or defined. Once 
DATABASE is defined, DATAMANAGER monitors input 
system structures by comparing the input to the 
ITEM’S characteristics. Each of the MSP product 
the DBMS interfaces, displays evidence that MS 
the importance of data integrity as a vital li 
cient and dependable control of data. 
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1 



00205 PRODUCE STUDENT LAYOUTS, 
00206 PRINT GIVING DESCRIPTIONS 

DESCRIPTION OF STUDENT 

LEVEL  NAME          LEN  TYPE   REMARKS 
1      STUDENT       069  GROUP  STUDENT 
2      STUDENT-NAME  050  CHAR   50 DIG/ALPH-NUM 
2      STUDENT-SSN   011  CHAR   3 NUM, "-", 2 NUM, "-", 4 NUM 
2      STUDENT-SERV  005  CHAR   05 DIG/ALPH-NUM 
2      STUDENT-RANK  003  CHAR   1 CHAR, "-", 1 NUM 

Figure 5.2 STUDENT example in DATAMANAGER Structure 


The DATAMANAGER nucleus prov 
of one type of security mechanis 
Controller, or dictionary admi 
unique password to each authori 
password combination must be 
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which does not commence with an 
by an authorized password. 

Several additional security 
by including the Audit and Secur 
system implementation. First, 
Protection Security Level. A user whose general level is 
lower than the specific insertion level is not allowed to 
insert, modify, or delete information within the data 
dictionary. This provides the capability to assign "read 
only" access. A user whose general level is lower than the 
specific protection level, or one who does not have a 
general security level assigned, is not allowed to establish 
protection for system members, or data structures. 

Second, users who do have a general security level equal 
to or higher than the specific protection level may use the 
PROTECT command to assign protection to specific members in 
the form of ACCESS, ALTER, and REMOVE security levels. This 
capability allows key users to control, or even prohibit, 
access to those structures that they own. Any member which 
is not owned but does require security can be assigned the 
same three control levels by the dictionary administrator. 
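
The level comparisons just described reduce to a few simple tests. The following sketch is ours, not MSP's: the function names are hypothetical, larger numbers are assumed to mean higher security levels, and treating a user with no assigned level as unable to update is an assumption (the text states this only for protection).

```python
# Hypothetical sketch of the level checks described above. Larger numbers
# are assumed to mean higher levels; None means no level was assigned.
from typing import Dict, Optional


def can_update(general: Optional[int], insertion_level: int) -> bool:
    """Insert, modify, or delete is refused below the insertion level."""
    return general is not None and general >= insertion_level


def can_protect(general: Optional[int], protection_level: int) -> bool:
    """PROTECT is refused below the protection level or with no level."""
    return general is not None and general >= protection_level


def may(general: Optional[int], member_levels: Dict[str, int], action: str) -> bool:
    """Check ACCESS, ALTER, or REMOVE against a member's assigned levels."""
    required = member_levels.get(action, 0)  # unprotected actions are open (assumed)
    return (general or 0) >= required


student_file = {"ACCESS": 2, "ALTER": 5, "REMOVE": 8}  # illustrative values
print(can_update(3, insertion_level=4))  # a "read only" user
print(may(5, student_file, "ALTER"))
print(may(5, student_file, "REMOVE"))
```

Keeping the three member-level checks separate from the two system-wide checks mirrors the division between what an owner may set with PROTECT and what the dictionary administrator controls.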

Finally, the Audit module provides the capability to 
produce over 500 different audit reports, using information 
contained within DATAMANAGER. The majority of these reports 
are reserved for use of the dictionary administrator alone. 
This includes the capability of logging all commands issued 
to the system. This "trace" mechanism increases security by 
providing a record of all entries, or attempted entries, to 
the system. 

The last significant objective of a data dictionary must 
be to support maintenance and documentation of the informa- 
tion contained within the information system. DATAMANAGER 
provides a set of commands unique to the maintenance func- 
tion. A listing of these is shown as Table 7. Maintenance 
can be supported during both interactive and batch sessions. 
A series of query and report commands are provided with the 
nucleus module to support usage studies, maintenance, and 
documentation. These commands are listed in Table 8. The 
REPORT, PRINT, and GLOSSARY commands provide a great deal of 
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TABLE 7 

DATAMANAGER Maintenance Commands 

INSERT     MODIFY       ENCODE 
REPLACE    BULK ENCODE  COPY 
ADD        RENAME       ALTER 
REMOVE     KEEP         DROP 
ALSO KEEP  PERFORM 



TABLE 8 

DATAMANAGER Report/Query Commands 

Report Commands 

PRINT      BULK REPORT  LIST 
SWITCH     REPORT       SKIP 
GLOSSARY   SPACE        BULK PRINT 
TEXT 

Query Commands 

WHAT       WHO          WHICH 
WHOSE      DOES         SHOW 


information to the dictionary administrator and other 
designated users. When system data is modified, the query 
and report commands can be used to provide updated documen- 
tation and records. 

One additional DATAMANAGER capability warrants mention 
with respect to maintenance and documentation. One system 
entity-type which has not been discussed and does not reside 
in the hierarchy shown earlier is the COMMAND-STREAM entity- 
type. This structure is a unique feature of DATAMANAGER 
that allows previously stored series of commands to be 
executed by using the PERFORM command. The use of specific 
COMMAND-STREAMs can be compared to the subroutines of a 
general programming language. While the COMMAND-STREAM can 
be used in many ways within DATAMANAGER, it becomes espe- 
cially useful during generation of reports and documentation 
during maintenance sessions. A "subroutine" can be speci- 
fied that will produce all standard reports; when system 
information is updated, the applicable reports are produced 
by one simple PERFORM command at the end of the maintenance 
session. 
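
The subroutine analogy can be made concrete with a small model of stored command streams. This is a sketch of the idea only: the names `keep_stream` and `perform`, the stream name STANDARD-REPORTS, and the command strings are all hypothetical and do not reproduce DATAMANAGER syntax.

```python
# Hypothetical model of stored command streams; not DATAMANAGER syntax.
command_streams = {}


def keep_stream(name, commands):
    """Store a named series of commands for later execution."""
    command_streams[name] = list(commands)


def perform(name, execute):
    """Run every command stored under `name`, returning results in order."""
    return [execute(command) for command in command_streams[name]]


# Store the standard reports once...
keep_stream("STANDARD-REPORTS", ["REPORT STUDENT", "GLOSSARY", "PRINT STUDENT"])

# ...then produce all of them with one call at the end of a session.
issued = perform("STANDARD-REPORTS", execute=lambda command: f"ran {command}")
print(issued)
```

As with a subroutine, the stream is defined once and invoked by name, so a change to the standard report set is made in one place.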



C. ADR DATADICTIONARY 

DATADICTIONARY is one of fourteen separate, but highly 
integrated, software products produced by Applied Data 
Research, Inc. (ADR). Initially introduced in 1978, the 
integrated system, Relational Information Management 
Environment (RIME), is considered to be one of the first 
true examples of the fourth generation of systems software. 
An article in Infosystems states 



Three conditions are certain in the 1980s. ... First, 
applications packages will not meet the need for most 
applications that will be computerized. Second, systems 
software products that improved productivity, reduced 
application costs and increased accessibility to infor- 
mation in the 1970s will be even more valuable in the 
1980s. And third, existing applications will not be 
readily rewritten or replaced and will have to be main- 
tained for many years. ... The success or failure of many 
organizations in the 1990s will depend on how effec- 
tively they improve and integrate data processing in 
their operations. This is particularly critical for 
organizations that have been traditional data processing 
users over the last 20 years and have worked with second 
and third generation mainframe hardware and software 
systems. [Ref. 23] 

Prior to analyzing ADR's data dictionary, it is important to 
review briefly the objectives of fourth generation software 
and integrated systems and to provide an overview of RIME. 
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Each of the "generations" of system software can be 
identified by one or more significant advancements. The 
first generation provided primarily assembly language 
programs. The second generation's gifts centered around 
development of high-level languages and improved operating 
systems. Numerous advances surfaced during the third gener- 
ation, e.g., database management systems, data dictionaries, 
structured programming techniques, early efforts at decision 
support systems, and program generators. During the fourth 
generation, it is anticipated that advances will occur in 
three primary areas: very high-level languages, relational 
database management systems, and the automated office or 
integrated information center. In the latter, all automated 
functions, including data processing, word processing, data- 
base and file management, decision support, program develop- 
ment and maintenance, and communications, will be combined 
into one "total" system. This could, in theory, be accom- 
plished by one giant program, or, in the case of ADR and 
other vendors, as a series of smaller, integrated packages. 

During 1982, the U.S. Army awarded a contract for the 
largest, most complex information processing project ever 
funded by the government. Named VIABLE (Vertical 
Installation Automation Baseline), the project will provide 
a nationwide automated network that will connect forty-seven 
military bases to massive computer power at five regional 
data processing centers. The network has been designed to 
support the management of information in peacetime and in 
times of war and other national emergencies. During the 
planning period, interest centered on three principal func- 
tional areas: communication, interactive program develop- 
ment, and database management. The primary contractor, 
Electronic Data Systems, selected 11 of ADR's products for 
use as the base of the VIABLE system. A complete list of 
ADR/RIME elements is included as Table 9. 






TABLE 9 

Components of ADR's DATACOM System 

Component        Function 

DATACOM/DB       Relational Database System 
DATADICTIONARY   Resource Control System 
DATAQUERY        English-like Query Language 
DATAREPORTER     Info. Retrieval/Reporting 
DATAENTRY        On-line Data Entry System 
COBOL/DL         Extended Language/Utilities 
LIBRARIAN        Program Management System 
ROSCOE           Program Maintenance System 
LOOK             Real-time Measurement System 
METACOBOL        Language Pre-compiler 
AUTOFLOW II      System Development Tool 
ADR/D-NET        Distributed Database Network 
ADR/EMAIL        Electronic Mail System 
ADR/IDEAL        Interactive Develop. System 

* New ADR products which have not yet been 
included in the Army's VIABLE project. 



Some of these elements can be considered to be high- 
priced extras or application-specialized options. If an 
organization were to utilize all components, users would 
have access to a complete database system with data 
dictionary, a relational query language, report and graph 
generators, extended COBOL compiler, program development 
support, distributed local data network, electronic mail 
system, and more. 

According to ADR literature, the heart of the integrated 
system is DATADICTIONARY. The company's database system, 
DATACOM/DB, a true relational database³ system that utilizes 
a patented flexible data structure, was designed especially 
to interact with DATADICTIONARY. As an active dictionary 
system, DATADICTIONARY is queried by all other components of 
the system prior to access of system information. This 
maximizes data integrity while minimizing data redundancy. 
DATADICTIONARY offers a menu-driven user interface. It 
provides security, supplies full documentation/maintenance 
capabilities, and can be extended to interact with future 
system products and to support future user requirements. 
The documentation provided with DATADICTIONARY and other ADR 
packages is almost overwhelming in its completeness. The 
dictionary alone has fifteen separate volumes. While an 
extremely capable system, DATADICTIONARY is not one that 
will be easily or quickly mastered. 

³ A relational database is one in which the relationships 
between data are implied by the values of the data. For 
example, two records are related if they have the same 
attribute, as STUDENT and PROFESSOR are related by the fact 
that they are associated with a particular CLASS. 

DATADICTIONARY provides 20 standard entity-types in its 
system standard schema and supports user creation of addi- 
tional, more application-specific schema descriptors. For 
most applications, the standard types listed in Table 10 



TABLE 10 

ADR DATADICTIONARY Standard Entity-types 

DATABASE   KEY       SYSTEM     REPORT 
AREA       ELEMENT   PROGRAM    JOB 
FILE       LIBRARY   MODULE     STEP 
RECORD     MEMBER    DATAVIEW   AUTHORIZATION 
FIELD      PANEL     PERSON     NODE 



will prove to be sufficient. DATADICTIONARY maintains a 
logical hierarchy among the principal standard entity-types, 
as indicated in Figure 5.3. Many of the standard entity- 
types are provided with primary relationships already 
defined with key subordinate entity-types. For example, in 
our STUDENT example, we will initially use the entity-type 
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Figure 5.3 A Logical 
1. Relationship-mapping -- describes the number of 
entity-occurrences which are the subjects and the 
objects of this relationship, e.g., the type of the 
relationship. DATADICTIONARY supports four types of 
relationship mappings, i.e., one-to-one, one-to-many, 
many-to-one, and many-to-many. 

2. Required-relationship -- describes whether each 
entity-occurrence in the named object entity-type is 
to be related to at least one entity-occurrence of 
the named subject entity-type. 

3. Automatic-relationship -- describes whether each 
entity-occurrence of the named object entity-type is 
to be automatically related to an entity-occurrence 
of the named subject entity-type when the object is 
added. 

4. Ordered-relationship -- describes whether the order of 
relationships added in this relationship-type is 
significant. An ordered-relationship allows entity- 
occurrences to be retrieved and displayed in a 
specific order. 
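
The four characteristics listed above can be pictured as the fields of a single record. The sketch below is an illustration under our own assumptions and is not ADR code: the class name `RelationshipType`, the mapping notation ("1:M" and so on, read as subject:object), the relationship name DB-AREA, and the `object_is_valid` check are all hypothetical.

```python
# Hypothetical record for a relationship definition; not ADR's format.
from dataclasses import dataclass

MAPPINGS = {"1:1", "1:M", "M:1", "M:M"}  # subject:object, notation assumed


@dataclass(frozen=True)
class RelationshipType:
    name: str
    subject_type: str
    object_type: str
    mapping: str = "1:M"
    required: bool = False   # object must relate to at least one subject
    automatic: bool = False  # relate the object when it is added
    ordered: bool = False    # order of added relationships is significant

    def __post_init__(self):
        if self.mapping not in MAPPINGS:
            raise ValueError(f"unknown mapping {self.mapping!r}")

    def object_is_valid(self, related_subjects: int) -> bool:
        """Check one object occurrence against the mapping and required flag."""
        if self.required and related_subjects < 1:
            return False
        if self.mapping.startswith("1:") and related_subjects > 1:
            return False  # at most one subject per object in 1:1 and 1:M
        return True


rel = RelationshipType("DB-AREA", "DATABASE", "AREA", mapping="1:M", required=True)
print(rel.object_is_valid(1), rel.object_is_valid(0), rel.object_is_valid(2))
```

In this reading, an AREA occurrence with no DATABASE violates the required flag, and one related to two DATABASE occurrences violates the one-to-many mapping.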

If using the interactive version, DATADICTIONARY Online, 
the user will be prompted by a series of panels, or menus. 
The Master Menu is displayed in Figure 5.4. The master menu 
supports creation, modification, and deletion of entity- 
occurrences. Additionally, it provides access to all other 
system menus through option (7). The following procedures 
would be utilized to create the STUDENT example within 
DATADICTIONARY. First, the Add Detail routine, option (2), 
is selected. In answering the system prompts, the user 
creates the new entity-occurrence, DATABASE.STUDENT, in the 
following dialog: 
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=> 

DDOL: SELECTION CRITERIA FOR DETAIL ADD 

LV TY ENTITY RECORD DD OCCURRENCE NAME VER STAT 
00 E  DATABASE         STUDENT         001 

CURRENT OCCURRENCE QUALIFIER: 
DATABASE STUDENT (001) TEST 

DETAIL ADD 

ATTRIBUTE     VALUE 
DESCRIPTION   NPS STUDENT DATABASE 
CONTROLLER    DEPARTMENT OF REGISTRAR 
AUTHOR        REGISTRAR 
BASE-ID       001 
BASE-TYPE     ADR/DB 
DBMS-USED     RELATIONAL 




MASTER MENU 

ENTER THE REQUESTED OPTION ==>          THERE ARE 08 OPTIONS 

1. DISPLAY MENU     MENU FOR DISPLAY FUNCTIONS 
2. ADD DETAIL       ADD DETAIL ENTITY-OCCURRENCE 
3. DELETE DETAIL    DELETE DETAIL ENTITY-OCCURRENCE 
4. UPDATE DETAIL    UPDATE DETAIL ENTITY-OCCURRENCE 
5. COPY             COPY/MODEL ENTITY-OCCURRENCE 
6. STATUS CHANGE    CHANGE ENTITY-OCCURRENCE STATUS 
7. SUPPORT MENU     ALIAS, DESCRIPTOR, RELATIONSHIP, 
                    TEXT AND QLIST 
8. SECURITY         OCCURRENCE SECURITY MAINTENANCE 

Figure 5.4 ADR DATADICTIONARY Master Menu 



Each of the 20 standard entity-types will contain predefined 
key attributes. Values for these attribute-types are 
entered during the Add Detail routine. In the case of the 
DATABASE entity-type, and as was shown above, the key attri- 
butes are DESCRIPTION, CONTROLLER, AUTHOR, BASE-ID, 
BASE-TYPE, and DBMS-USED. 

In similar fashion, the user must create the subordinate 
logical structures, AREA.STUDENT, FILE.STUDENT, and 
RECORD.STUDENT. As each occurrence is created, it must be 
related to the next highest entity-occurrence in the logical 
hierarchy, e.g., FILE.STUDENT must be related to 
AREA.STUDENT. For this process, the user invokes the 
Relationship Definition Panel to define the relationships. 
DATADICTIONARY will respond with the Relationship Definition 
Display which presents the characteristics of each of the 
relationships as it is enacted. Examples of these panels 
are shown below: 

=> 
DDOL: RELATIONSHIP DEFINITION 

RELATIONSHIP NAME     $INTERNAL 
SUBJECT ENTITY TYPE   DATABASE.STUDENT 
OBJECT ENTITY TYPE    AREA.STUDENT 

=> 
RELATIONSHIP DEFINITION DISPLAY 

SELECTION   $INTERNAL   DATABASE.STUDENT   AREA.STUDENT 

NAME        SUBJ TYPE   OBJ TYPE   MAP   REQ   AUTO   ORDER 
$INTERNAL   DATABASE    AREA       1 M   Y     N      N 

As a final step in installing the STUDENT database, QLIST 
commands must be used to define specific fields, keys, and 
elements within RECORD.STUDENT. This is the point where the 
specific attributes of the STUDENT example, e.g., SSN, Name, 
Service, and Rank, are entered into the database design. 
The user defines attribute name, parent, class, type, 
length, and number of repetitions. One example of this 
process is as follows: 



=> 



DDOL: SELECTION CRITERIA FOR RECORD QLIST MAINT

LV TY ENTITY RECORD DD OCCURRENCE NAME VER STAT

00 E RECORD STUDENT TEST

CURRENT OCCURRENCE QUALIFIER:

RECORD STUDENT (001) TEST

******************************************************

RECORD QLIST MAINTENANCE

E FC  FIELD NAME  PARENT NAME  INSERT AFT  C  T  LEN  REP
  A   SERVICE     SSN          NAME        S  C  004  001

Looking at the last line of the figure, the user has
indicated the following:

FC (Function Code) = Add a field
Field Name = SERVICE
Parent Name = SSN (in this case, this is the key field)
Insert After = NAME (SERVICE's value will follow NAME's)
C (Class) = Simple (as opposed to a compound field)
T (Type) = Character (vice a numeric or binary field)
LEN (Length of Field) = 4
REP (Number of Repetitions) = 001 (vice a repeating field)

At this point, the schema of STUDENT has been entered into
DATADICTIONARY. The user may now use DATACOM/DB facilities
to enter attribute-values into the system. Upon completion,
the database administrator or authorized users can create as
many external views, or subschemas, as desired.
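
The field definition in the panel above can be restated as a small data structure. The following Python sketch is ours and is not ADR code: the class name, the helper function, and the assumed starting layout of RECORD.STUDENT (SSN, NAME, RANK) are hypothetical, and only the codes "S" (simple) and "C" (character) come from the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FieldDefinition:
    """One QLIST-style field definition (hypothetical structure)."""
    name: str
    parent: str
    insert_after: str
    field_class: str      # "S" = simple; other codes are not given in the text
    field_type: str       # "C" = character; other codes are not given in the text
    length: int
    repetitions: int = 1  # 1 means the field does not repeat

    def is_repeating(self) -> bool:
        return self.repetitions > 1


def insert_field(layout: List[str], field: FieldDefinition) -> List[str]:
    """Return a new field order with `field` placed directly after `insert_after`."""
    position = layout.index(field.insert_after) + 1
    return layout[:position] + [field.name] + layout[position:]


# The SERVICE field from the last line of the panel.
service = FieldDefinition("SERVICE", "SSN", "NAME", "S", "C", 4, 1)

# Assumed layout of RECORD.STUDENT before SERVICE is added.
record_layout = insert_field(["SSN", "NAME", "RANK"], service)
print(record_layout)
```

The point of the sketch is only that "Insert After" fixes the position of the new field relative to an existing one; how DATADICTIONARY stores the definition internally is not described in our sources.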

DATADICTIONARY receives high marks in the areas of data
integrity, security, and documentation/maintenance.
DATADICTIONARY's logical hierarchy of structures and
systematic installation procedures tend to enforce data
integrity. The dictionary's extension routines and view
generation processes have been written to ensure that data
integrity is maintained throughout expansion or
specialization of the database. To enforce security,
DATADICTIONARY provides multiple layers of protection. Two
separate and independent mechanisms are provided in all
implementations. These are (1) use of entity passwords, and
(2) inclusion of locks and override codes. If the
installation is the Online version, a third mechanism, user
validation, is available. As each entity is created, or at
any time afterwards, a four-digit password can be assigned
to that entity. Passwords can be either unique or assigned
to a series of related entities. Any user attempting to
modify or access a password-protected entity-occurrence will
be queried to provide the applicable password prior to
gaining access. The second layer of protection centers on
use of LOCK and OVERRIDE codes. Unlike passwords, which
either allow or prohibit access, lock codes can be utilized
to limit the degree of access granted. Three levels of
security are provided:

LOCK0    No restrictions exist on an entity
         (default setting)

LOCK1    The entity cannot be updated or deleted
         without an override code. The entity
         can be copied, displayed, or printed
         without restrictions.

LOCK2    No action will be permitted unless the
         override code is given to the system.

The actual override codes will be used dictionary-wide, that
is, a single code will exist to satisfy LOCK1 conditions
while another code exists to access entities protected by
LOCK2. Finally, if using DATADICTIONARY Online, the highest
layer of security becomes user validation. The name of each
user of the system is defined as a PERSON entity-type. Each
entity-occurrence will include a unique password which must
be provided to enter the system through the online
interface. Four levels of authorization are supported by
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Authorization at one level will automatically provide all 
lower authorizations. 

ADR's multiple-layered approach to security provides a
system that is both highly flexible and very secure. The
database administrator will be able to provide whatever
degree of access is required to each individual user as
well as to each group of users within the system. If one
layer of security is broken, access will be prevented by the
other security mechanisms.
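
The way entity passwords and lock levels combine can be illustrated with a short Python sketch. This is our own simplification and not ADR's implementation: the names `Entity`, `is_allowed`, and `OVERRIDE_CODES`, the code values, and the choice to test the password before the lock are all assumptions, and user validation (the third layer) is left out.

```python
from dataclasses import dataclass
from typing import Optional

# Actions that LOCK1 leaves unrestricted, per the description above.
UNRESTRICTED_AT_LOCK1 = {"copy", "display", "print"}

# Dictionary-wide override codes, one per lock level (values are made up).
OVERRIDE_CODES = {1: "OVR1", 2: "OVR2"}


@dataclass
class Entity:
    name: str
    lock_level: int = 0             # 0 = LOCK0 (default), 1 = LOCK1, 2 = LOCK2
    password: Optional[str] = None  # four-digit entity password, if assigned


def is_allowed(entity: Entity, action: str,
               password: Optional[str] = None,
               override: Optional[str] = None) -> bool:
    """Apply the password layer first, then the lock layer."""
    if entity.password is not None and password != entity.password:
        return False
    if entity.lock_level == 0:
        return True
    if entity.lock_level == 1 and action in UNRESTRICTED_AT_LOCK1:
        return True
    return override == OVERRIDE_CODES[entity.lock_level]


student = Entity("RECORD.STUDENT", lock_level=1, password="1234")
print(is_allowed(student, "display", password="1234"))
print(is_allowed(student, "update", password="1234"))
print(is_allowed(student, "update", password="1234", override="OVR1"))
```

Because the two checks are independent, defeating one of them (for example, learning the entity password) still leaves the lock in force, which is the property the multiple-layer design is meant to provide.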



Invocation of any function thus authorized on any entity
is still subject to the password and lock provision
discussed earlier in this section. Thus, a user with
$DD_0PD authorization cannot modify an entity that is
password protected unless the required password is
supplied. [Ref. 24]

DATADICTIONARY provides extensive capabilities to support
maintenance and documentation of the data dictionary. It
can be maintained by using either the Online maintenance
facility or available batch commands. If using the online
facility, a series of screen panels will again guide the
user through the desired maintenance activity. This
facility will greatly facilitate individual changes;
however, major changes affecting numerous entities would be
initiated most easily through batch commands. In either
case, maintenance centers around four principal functions:

1. adding, copying, updating, or deleting system entities

2. search for, identification of, and creation of
   entity aliases

3. maintenance of descriptors and schema descriptors

4. maintenance of descriptive texts associated with
   system entities

Similarly, DATADICTIONARY provides numerous report
generation capabilities, most of which can be initiated
through either batch or Online Maintenance sessions.
Principal report types are shown in Table 11. Generated
reports will support both the initial generation of user
databases and subsequent maintenance of system data and the
structures utilized to display it.





TABLE 11

Principal Reports of DATADICTIONARY

INDEX          INDENTED         DETAIL

FIELD          TEXT             ALIAS

DESCRIPTOR     RELATIONSHIPS    DEFINITIONS
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D. ORACLE 



ORACLE is a relational database management system
developed by Relational Software Incorporated of Menlo Park,
California. It was originally developed for use with
Digital Equipment Corporation PDP minicomputers and has been
converted to operate on IBM mainframes as well [Ref. 25].
Included in ORACLE is a dependent data dictionary that
performs a limited number of the functions discussed in
previous chapters.

Data is stored in ORACLE as relations, or two-dimensional
tables, which are organized into rows and columns. SQL
(Structured Query Language) is used for query,
manipulation, definition, and control of the ORACLE
database. Information about the contents of a table, its
creator, authorized users, calling programs, and associated
views is kept in the data dictionary and can be retrieved
via SQL commands.

ORACLE's logical hierarchy of structures, as shown in
Figure 5.5, demonstrates the comparative simplicity of this
system. In this figure, a single arrowhead represents a
one-to-one relationship while the double arrowheads signify
one-to-many relationships. The database is divided into
logical partitions which can only be created or altered by
the database administrator. When users define tables, the
system allocates memory for one indexspace and one
dataspace. The indexspace is used by the
database/dictionary to store information about the table
while the dataspace is utilized for storing the actual
information. As data is entered into the database, the
system automatically appends extents (and pages) as
necessary to support specific tables.

Figure 5.5 ORACLE's Logical Hierarchy

ORACLE's 18 data dictionary tables are described in
Figure 5.6. An example of one of the tables, CATALOG,
appears in Figure 5.7. Tables with the "SYS" prefix include
information on system data in addition to the user's data.
For example, a display of SYSCATALOG might appear as Figure
5.8. In this particular example, there are 23 entries, 18
of which are system tables or views.

ORACLE's data dictionary is automatically updated whenever
any additions or deletions are made to the database or
when views are defined or user privileges are changed, so it
always has a current description of the database. As an
example, assume a new view, NAVYVIEW, is created using the
SQL CREATE command:

UFI> CREATE VIEW NAVYVIEW AS
  2  SELECT NAME, SSN, RANK
  3  FROM STUDENTS
  4  WHERE SERVICE = 'USN'

View created.

DTAB         - Description of tables & views in Oracle data
               dictionary
SYSCATALOG   - Profile of tables & views accessible to user
CATALOG      - Profile of tables accessible to user, excluding
               data dictionary
TAB          - List of tables, views, clusters, and synonyms
               created by user
SYSCOLUMNS   - Specifications of columns in accessible tables
               and views
COLUMNS      - Specifications of columns in tables (excluding
               data dictionary)
COL          - Specifications of columns in tables created
               by the user
SYSINDEXES   - List of indexes, underlying columns, creator,
               and options
INDEXES      - Indexes created by user & indexes on tables
               created by user
SPACES       - Selection of space definitions for creating
               tables & clusters
VIEWS        - Quotations of the SQL statements upon which
               views are based
SYSTABAUTH   - Directory of access authorization granted by
               or to the user
EXTENTS      - Data structure of extents within tables
STORAGE      - Data and index storage allocations for user's
               own tables
SYSSTORAGE   - Summary of all database storage (for DBA
               use only)
SYSUSERAUTH  - Master list of Oracle users (for DBA use only)
SYSEXTENTS   - Data structure of tables throughout system
               (for DBA use only)
PARTITIONS   - File structure of files within partitions
               (for DBA use only)

Figure 5.6 Tables of the ORACLE Data Dictionary
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Figure 5.7 ORACLE CATALOG Listing 
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VIEW 


268800 



Figure 5.8 ORACLE SYSCATALOG Listing 

Upon completion of this dialog, all ORACLE data dictionary
files will have been automatically updated to include the
new view. The CATALOG table would now appear as shown in
Figure 5.9.
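
The same behavior can be observed in any relational system that keeps its catalog in ordinary tables. The sketch below uses SQLite through Python's standard `sqlite3` module purely as a stand-in; it is not ORACLE, its catalog table is `sqlite_master` rather than CATALOG, and the column list of STUDENTS is our assumption.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the ORACLE example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENTS (NAME TEXT, SSN TEXT, RANK TEXT, SERVICE TEXT)"
)

# Same view definition as in the dialog above.
conn.execute(
    "CREATE VIEW NAVYVIEW AS "
    "SELECT NAME, SSN, RANK FROM STUDENTS WHERE SERVICE = 'USN'"
)

# The catalog is an ordinary table and already lists the new view;
# no separate dictionary maintenance step was needed.
catalog = conn.execute(
    "SELECT name, type FROM sqlite_master ORDER BY name"
).fetchall()
print(catalog)
```

As with ORACLE, the dictionary entry is a side effect of the CREATE VIEW statement itself, so the catalog cannot fall out of step with the database it describes.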

NAME        CREATOR    TABTYPE    TABID

STUDENTS    LANDIN     TABLE      228609
ARMYVIEW    OWENS      VIEW       268800
NAVYVIEW    LANDIN     VIEW       288240

Figure 5.9 ORACLE CATALOG Listing With New View

ORACLE provides security by using its data dictionary to
control access within the database. The database
administrator (DBA) provides the first level of access by
entering the user's name into the data dictionary's
SYSUSERAUTH table. Initial privileges, or subsequent changes
to authorized privileges, are issued using the GRANT or
REVOKE commands. ORACLE also supports multi-layered access:
in addition to privileges authorized by the DBA, a user can
grant various degrees of access privilege to others for
tables or views which he or she has created. A list of
current authorizations is maintained in the dictionary's
SYSTABAUTH view, as shown in Figure 5.10.
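
The second layer, in which a creator passes privileges on to other users, can be sketched as follows. This Python model is ours: the class `AuthorizationTable`, its methods, and the rule that only the creator may grant or revoke are simplifying assumptions, and the DBA's own privileges are omitted. It is not ORACLE's GRANT/REVOKE implementation.

```python
from typing import Dict, Set, Tuple


class AuthorizationTable:
    """Hypothetical stand-in for a SYSTABAUTH-style list of authorizations."""

    def __init__(self) -> None:
        self.creators: Dict[str, str] = {}                     # object -> creator
        self.privileges: Dict[Tuple[str, str], Set[str]] = {}  # (object, user) -> rights

    def register(self, creator: str, obj: str) -> None:
        self.creators[obj] = creator

    def _check_creator(self, user: str, obj: str) -> None:
        if self.creators.get(obj) != user:
            raise PermissionError(f"{user} did not create {obj}")

    def grant(self, grantor: str, grantee: str, obj: str, privilege: str) -> None:
        self._check_creator(grantor, obj)
        self.privileges.setdefault((obj, grantee), set()).add(privilege)

    def revoke(self, grantor: str, grantee: str, obj: str, privilege: str) -> None:
        self._check_creator(grantor, obj)
        self.privileges.get((obj, grantee), set()).discard(privilege)

    def allowed(self, user: str, obj: str, privilege: str) -> bool:
        if self.creators.get(obj) == user:
            return True  # creators keep full access to their own objects
        return privilege in self.privileges.get((obj, user), set())


auth = AuthorizationTable()
auth.register("OWENS", "ARMYVIEW")
auth.grant("OWENS", "LANDIN", "ARMYVIEW", "SELECT")
print(auth.allowed("LANDIN", "ARMYVIEW", "SELECT"))
```

Every access decision in this sketch is answered from the authorization table alone, which mirrors the role SYSTABAUTH plays in the text.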

ORACLE is a strong performer in the data integrity
category. Since the data dictionary is an integral part of
the database system, data is only maintained at one location
within the database. This prevents two users from acquiring
data from the database and getting different results. If
data were duplicated within the system, it would be possible
for one location to be updated while the other was not.
Figures 5.7 through 5.10 show that the ORACLE user will deal
mostly with subsets of the database, or subschemas.

Figure 5.10 ORACLE SYSTABAUTH Listing for User Owens

ORACLE's documentation is limited to the information
that can be found in the data dictionary tables. It does
not provide information about which users use which data,
how often data is used, or when it is used. ORACLE does
support maintainability through automatic update of its
tables and through the concept of data independence. This
concept implies a separation of data definitions from the
programs or queries that might access the data in the
database. This allows the structures or definitions of the
data constructs to be modified without necessitating changes
in the programs or queries that access the database. If a
table is extensively modified, a view can be created to
interface with current programs. ORACLE's data integrity
will maintain the currency of the view by automatically
updating the view whenever applicable portions of the
governing table are modified.
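
Data independence through views can be demonstrated with a small experiment. As before, SQLite (via Python's `sqlite3` module) stands in for ORACLE; the CURRICULUM column, the sample rows, and the query text are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENTS (NAME TEXT, SSN TEXT, SERVICE TEXT)")
conn.execute("INSERT INTO STUDENTS VALUES ('SMITH', '123-45-6789', 'USN')")
conn.execute(
    "CREATE VIEW NAVYVIEW AS "
    "SELECT NAME, SSN FROM STUDENTS WHERE SERVICE = 'USN'"
)

# An existing "program": a query written against the view only.
query = "SELECT NAME, SSN FROM NAVYVIEW ORDER BY NAME"
before = conn.execute(query).fetchall()

# The governing table is restructured and new data is added.
conn.execute("ALTER TABLE STUDENTS ADD COLUMN CURRICULUM TEXT")
conn.execute(
    "INSERT INTO STUDENTS (NAME, SSN, SERVICE, CURRICULUM) "
    "VALUES ('JONES', '987-65-4321', 'USN', '367')"
)

# The unchanged query still runs and reflects the new row.
after = conn.execute(query).fetchall()
print(before)
print(after)
```

The query text never changes, yet it survives the change in table structure and sees current data, which is the maintainability benefit claimed for data independence.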

ORACLE does provide the basic functions of definition,
update, retrieval, and software interface. However, like
other relational database management systems with dependent
data dictionaries, it does not offer the range of functions
of the other data dictionaries discussed in this chapter,
nor does it accomplish satisfactorily the three main
objectives of data management discussed in Chapter IV.
ORACLE's data dictionary

provides little more than a method of defining the
schema. The relational database management system
'dictionary' arises because the system needs a way to store
the schema and it does this through the use of the same
tables (relations) as it uses for the main database.
[Ref. 26]



ORACLE could, however, serve as a good starting point for
further development.

The modern relational DBMS does provide a very good
basis for a good dictionary system. This is because the
normal relational DBMS is equipped with two features
that help in making the implementation easy:

1. Many relational DBMS now have a "triggering" feature
that causes a procedure to be invoked on some data
condition or event. Such a feature is needed to tie a
DBMS to a dictionary system.

2. The availability of the schema tables substantially
reduces the effort in implementing the dictionary
system. [Ref. 27]
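
The "triggering" feature mentioned in the quotation can be sketched with SQLite triggers, again through Python's `sqlite3` module. The table DD_CHANGE_LOG, its columns, and the trigger names are hypothetical; the sketch shows only how a procedure tied to a data event could feed usage information into a dictionary table. It does not describe a feature of the ORACLE version we evaluated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE STUDENTS (NAME TEXT, SERVICE TEXT);

    -- Hypothetical dictionary table recording activity against STUDENTS.
    CREATE TABLE DD_CHANGE_LOG (TABLE_NAME TEXT, EVENT TEXT);

    CREATE TRIGGER LOG_STUDENT_INSERT AFTER INSERT ON STUDENTS
    BEGIN
        INSERT INTO DD_CHANGE_LOG VALUES ('STUDENTS', 'INSERT');
    END;

    CREATE TRIGGER LOG_STUDENT_UPDATE AFTER UPDATE ON STUDENTS
    BEGIN
        INSERT INTO DD_CHANGE_LOG VALUES ('STUDENTS', 'UPDATE');
    END;
    """
)

conn.execute("INSERT INTO STUDENTS VALUES ('SMITH', 'USN')")
conn.execute("UPDATE STUDENTS SET SERVICE = 'USMC' WHERE NAME = 'SMITH'")

# The dictionary-side table was filled in by the triggers, not by the user.
log = conn.execute(
    "SELECT TABLE_NAME, EVENT FROM DD_CHANGE_LOG ORDER BY rowid"
).fetchall()
print(log)
```

A log of this kind is one way the documentation gap noted for ORACLE (who uses which data, and when) might be narrowed on top of a relational DBMS.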



The most important shortcoming of ORACLE's data dictionary
is its lack of documentation, without which it is difficult
to manage all aspects of an organization's data. If this
objective were incorporated into the system, ORACLE would be
a much more valuable tool.



E. COMPARISON OF DATA DESIGNER, DATAMANAGER,
DATADICTIONARY, AND ORACLE

Now that four representative samples of commercial data
dictionaries have been evaluated, we will compare the
primary features of each and identify which one(s) have come
closest to providing the features of our ideal system. For
ease of comparison, we have grouped all of the features,
functions, and guidelines that have been identified into the
six evaluation criteria categories: system standard schema
and extensibility, command and query languages, ease of use
(including menus), security, documentation and reports, and
application interfaces.

As the data dictionaries are evaluated in each of the
six categories, a brief chart will be used to compare each
dictionary against the FIPS standards. Each chart will
compare five data dictionaries:

FIPS = The ideal/FIPS data dictionary
MSP = MSP DATAMANAGER
ADR = ADR DATADICTIONARY
DDE = DATA DESIGNER
ORA = ORACLE DBMS/DD

A very subjective scoring system will be used, with grades
ranging from three to zero. The ideal/FIPS standard will
automatically receive a grade of "3" in each area,
representing the ideal combination of features. The meaning
of each grade is as follows:

"3" = Very strong performance by DD; no criticism

"2" = Good performance by DD; one or more significant
      shortcomings

"1" = (1) DD supports functional area very poorly; or
      (2) DD does not support functional area, but
      another component of the system does

"0" = DD (and remainder of system) fails to support
      this function






First, the data dictionary should provide a system
standard schema and the capability to add new entities,
relationships, and attributes to it. As shown in Table 12,
while DATADICTIONARY and DATAMANAGER closely resemble the
ideal system proposed by the FIPS, DATA DESIGNER and, in
particular, ORACLE fail to provide these capabilities.
DATAMANAGER supports three "add-on" collections of schema
descriptors. When added to the standard schema, each will
increase DATAMANAGER's capabilities to support a specific
application, e.g., programming.



TABLE 12

Category One: Schemas and Extensibility

Functional Category     FIPS         MSP          ADR    DDE          ORA

System Stand. Schema    3            3            3      1            0
  Entity-types          (10)         [illegible]  (20)   (2)          (?)
  Relationship-types    [illegible]  [illegible]  (10)   [illegible]  [illegible]
  Attribute-types       [illegible]  [illegible]  (50+)  (?)          [illegible]
DA/User Extensible      3            3            3      0            0

Category Subtotals      6            6            6      1            0



Second, the data dictionary should provide a command
language that will support queries from users while
reserving some capabilities solely for the use of the
dictionary administrator. This last ingredient supports
security and data integrity. Again, as seen in Table 13,
DATADICTIONARY and DATAMANAGER provide all capabilities of
the FIPS standard while the other two lag behind.

TABLE 13

Category Two: Command/Query Languages

Functional Category     FIPS   MSP   ADR   DDE   ORA

CMD Interface Lang.       3     3     3     3     2
Query Commands            3     3     3     1     1
DA-Only Commands          3     3     3     3     2

Category Subtotals        9     9     9     4     5





Third, the ideal data dictionary must be relatively easy
to use, yet still powerful enough to support the experienced
user. One of the major ingredients of user-friendliness is
a menu-driven (or panel-driven) format. Good,
easy-to-understand examples are another important aid to the
new user. Table 14 reveals that, in our opinion, none of the
four systems can be considered easy to use. Looking at the
four as a group, two fail to use menus, one provides
examples which are complex and hard to understand, and the
fourth fails to provide either menus or good examples.

TABLE 14

Category Three: Relative Ease of Use

Functional Category     FIPS   MSP   ADR   DDE   ORA

Menu-Driven               3     0     3     3     0
New User Friendly         3     1     2     3     2
Good Setup Example
in Documentation          3     1     2     3     3

Category Subtotals        9     2     7     5     5

Fourth, security is one of the primary objectives of a
data dictionary. It should not only be able to control
general access to the system, but should also support the
capability to provide different levels of access to
different users. In Table 15, three of the four,
DATADICTIONARY, DATAMANAGER, and ORACLE, receive high marks
for providing both aspects of security. Security for
information contained within DATA DESIGNER must be provided
by the parent DBMS.

TABLE 15

Category Four: Security

Functional Category     FIPS   MSP   ADR   DDE   ORA

Access Control
(Password)                3     3     3     1     2
Degrees of Access
(Levels)                  3     3     3     1     3
DA-only Privileges        3     3     3     2     3

Category Subtotals        9     9     9     4     8

Fifth, the clearness and logical layout of system
documentation should be considered. Additionally, the
reports and the documentation prepared by the data
dictionary must be evaluated for usability. As indicated in
Table 16, each of the four data dictionaries approaches that
of our ideal FIPS standard. It is interesting to note that
the two frontrunners, DATADICTIONARY and DATAMANAGER, have
some problems with documentation complexity.

TABLE 16

Category Five: Documentation and Reports

Functional Category     FIPS   MSP   ADR   DDE   ORA

SYS Documentation
clear/laid out well       3     2     2     3     3
Good Examples of
Report Types              3     2     3     3     3
Reports Readable          3     3     3     3     3

Category Subtotals        9     7     8     9     9


Finally, the ideal data dictionary should support a
variety of applications, interfacing with both DBMS and
programming languages. DATADICTIONARY and DATAMANAGER both
provide interfaces to one or more DBMS and to two or more
programming languages (see Table 17). While DATA
DESIGNER and ORACLE only interact with their system DBMS,
DATAMANAGER provides flexibility and versatility by
supporting several popular DBMS.



TABLE 17

Category Six: Application Interfaces

Functional Category     FIPS   MSP   ADR   DDE   ORA

DBMS Interface(s)         3     3     2     1     1
Language Interfaces       3     3     3     1     1

Category Subtotals        6     6     5     2     2

When total "scores" are calculated, the results are as
shown in Table 18. While none of the systems provides all
of the characteristics of the ideal/FIPS system, ADR
DATADICTIONARY and MSP DATAMANAGER come the closest. If an
organization were starting "fresh", with no previous
investment in software, the ADR family of products, RIME,
warrants serious consideration. If, on the other hand, the
organization already has one of the popular DBMS, and is
simply seeking to add a new, or better, data dictionary, the
free-standing DATAMANAGER might very well satisfy the need.
In each of these two excellent commercial packages, the
observed shortcomings lie in the areas of user friendliness
and clear examples for new users. Although important
requirements, these faults will be overcome as the users
gain experience.

TABLE 18

Data Dictionary Comparison Totals

Functional Category     FIPS   MSP   ADR   DDE   ORA

Schemas/Extensible        6     6     6     1     0
Command/Query Lang.       9     9     9     4     5
Ease-of-Use               9     2     7     5     5
Security                  9     9     9     4     8
Documentation/Rpts        9     7     8     9     9
Application Inter.        6     6     5     2     2

Comparison Totals        48    39    44    25    29
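
The comparison totals are simply column sums of the six category subtotals. The short Python check below re-adds the subtotals reported in Table 18; the dictionary layout and function name are ours, and no weighting of categories is applied because none is described in the text.

```python
# Category subtotals as reported in Table 18.
CATEGORY_SUBTOTALS = {
    "Schemas/Extensible":  {"FIPS": 6, "MSP": 6, "ADR": 6, "DDE": 1, "ORA": 0},
    "Command/Query Lang.": {"FIPS": 9, "MSP": 9, "ADR": 9, "DDE": 4, "ORA": 5},
    "Ease-of-Use":         {"FIPS": 9, "MSP": 2, "ADR": 7, "DDE": 5, "ORA": 5},
    "Security":            {"FIPS": 9, "MSP": 9, "ADR": 9, "DDE": 4, "ORA": 8},
    "Documentation/Rpts":  {"FIPS": 9, "MSP": 7, "ADR": 8, "DDE": 9, "ORA": 9},
    "Application Inter.":  {"FIPS": 6, "MSP": 6, "ADR": 5, "DDE": 2, "ORA": 2},
}


def comparison_totals(subtotals):
    """Sum each system's subtotals across all categories (unweighted)."""
    totals = {}
    for scores in subtotals.values():
        for system, score in scores.items():
            totals[system] = totals.get(system, 0) + score
    return totals


totals = comparison_totals(CATEGORY_SUBTOTALS)

# Rank the four commercial packages, leaving out the ideal/FIPS column.
ranking = sorted((s for s in totals if s != "FIPS"), key=totals.get, reverse=True)
print(totals)
print(ranking)
```

An organization that values one category more than another (security over ease of use, for instance) could multiply each subtotal by a weight before summing; the unweighted sum is used here only because it reproduces the table.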

In the case of the other two dictionaries, their
shortcomings would be far harder to forgive. Their problems
lie in areas of standard schemas, extensibility, security,
etc. Each seems more user-friendly, but, since they do less,
there are fewer procedures to be explained. DATA DESIGNER,
although an interesting package, simply does not provide
several of the primary characteristics that we expect to
find in an ideal data dictionary. ORACLE is certainly the
weakest of the four dictionaries we evaluated. As part of
the ORACLE DBMS, this system does provide some data
dictionary features. However, it is not the full-featured
data dictionary we would recommend.

VI. EXPANSIONS OF THE ROLE OF DATA DICTIONARIES

In this chapter we will suggest ways in which the role
of the data dictionary can be expanded beyond the basic uses
discussed in previous chapters. We will look first at how
the data dictionary can enforce standards in today's
increasingly common distributed data processing environment.
Then we will show how the process of decision making can be
supported through the use of a data dictionary. In
conclusion, we will attempt to foresee where data dictionary
technology will lead information resource management in the
years to come.

A. DISTRIBUTED DATA PROCESSING

Our discussion of databases up to this point has
centered around the assumption that an organization has one
centralized database, with centralized database management
and control, that would be accessed by all users. However,
many organizations have decided to distribute computing
power to various departments and/or outlying sites,
depending on the organization's structure. In such a
situation, it is also likely that the organization's
database will have to be distributed. A distributed
database is "a consistent, logically interrelated
collection of data stored at dispersed locations"
[Ref. 28]. These dispersed locations, called nodes, are
connected by means of a network which allows the nodes to
communicate.

Many factors have contributed to the increasing
popularity of distributed processing. Two of the most
important are the following: [Ref. 29]

1. Numerous advances in technology that have provided
more powerful processing hardware at lower cost and
improved communication and network capabilities.

2. The need for faster and easier access to
time-critical information to assist in the decision
making of organizations with geographically
dispersed components requiring unified information
sharing and processing. (This concept will be
discussed in detail in the next section.)

For organizations that employ a centralized approach to
control widely-dispersed, autonomous divisions, an attempt
to adhere to the traditional concepts of centralized
information resources may be ineffective and self-defeating.
These organizations might be tempted to sacrifice the
ability to better satisfy user needs in order to preserve
control and traditional relationships. Fortunately,
managers are rapidly becoming aware of the many potential
advantages of distributing some, or all, of the
organization's data processing functions to the user level.
Technological advances continue to encourage these changes
because

The availability of major computing resources in small,
low-cost packages allows the dedication and distribution
of needed capabilities, either standing alone or
interconnected, when and where they are needed. Many of the
complexities of centralized large-scale computing
facilities are no longer necessary. [Ref. 30]

It is important to remember, however, that

the complexities of integrated systems require digital
data communications, appropriate software, and extensive
planning and coordination. These complexities should
not be underestimated. [Ref. 30]



One very successful corporation, Hewlett-Packard, 
utilizes a combination of centralized, decentralized, and 
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capabilites of those operating solely in a centralized envi- 
ronment. However, a distributed data dictionary must 
support three specialized functions in addition to basic 
data dictionary functions: 

1. the ability to locate data within the network

2. the coordination/management of distributed data

3. the ability to perform data transformation in
support of user applications

The distributed data dictionary's directory function enables
it to identify which network node contains the specific
information that is needed. Whether the particular database
is distributed by replication or partitioning, the data
dictionary must provide information about its logical and
physical characteristics.
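
The directory function amounts to a lookup from a data item to the node or nodes that hold it. The Python sketch below is illustrative only: the node names, the data items, and the rule of preferring the local node when a replicated item is available there are all our assumptions.

```python
from typing import Dict, List

# Hypothetical directory: data item -> nodes holding a copy of it.
DIRECTORY: Dict[str, List[str]] = {
    "STUDENT": ["MONTEREY", "WASHINGTON"],  # replicated at two nodes
    "PAYROLL": ["WASHINGTON"],              # stored at a single node
}


def locate(item: str, local_node: str,
           directory: Dict[str, List[str]] = DIRECTORY) -> str:
    """Return the node to query for `item`, preferring the local copy."""
    nodes = directory[item]  # raises KeyError for data the dictionary does not know
    return local_node if local_node in nodes else nodes[0]


print(locate("STUDENT", "MONTEREY"))
print(locate("PAYROLL", "MONTEREY"))
```

With such a directory the user names only the data wanted; the choice of node, and therefore the physical distribution, stays hidden behind the dictionary.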

In the case of replicated data, where functionally
identical copies of the data are stored at multiple nodes in
the network, the distributed DD/DS [data dictionary]
must have knowledge of the known redundancies throughout
the network. Synchronization of updates in this case is
critical. [Ref. 33]

In a partitioned database, where only certain portions of
the database are located at individual nodes, the data
dictionary's role becomes even more important because "it
must know the relationships among the pieces, and be able to
manage all the parts, such that this physical dispersion of
the data is transparent to the user" [Ref. 34]. Finally,
the distributed data dictionary may be required to perform
transformation of data to support various users. If serving
a heterogeneous network (one in which dissimilar types of
hardware and software coexist), the data dictionary will have
to translate between different data and storage structures.
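
A translation of this kind can be driven entirely by metadata held in the dictionary. The following Python sketch assumes a hypothetical mapping for the STUDENT record in which each source field names its target field and a conversion rule; the field names, formats, and conversions are invented for the illustration.

```python
from typing import Any, Callable, Dict, Tuple

# source field -> (target field, conversion rule)
FieldMapping = Dict[str, Tuple[str, Callable[[str], Any]]]

# Hypothetical metadata mapping stored in the dictionary for STUDENT records.
STUDENT_MAPPING: FieldMapping = {
    "SSN": ("ssn", lambda value: value.replace("-", "")),
    "NAME": ("name", lambda value: value.strip().title()),
    "SERVICE": ("service", lambda value: value.strip()),
}


def transform(record: Dict[str, str], mapping: FieldMapping) -> Dict[str, Any]:
    """Build the target record by applying each field's conversion rule."""
    return {
        target: convert(record[source])
        for source, (target, convert) in mapping.items()
    }


# A fixed-width style source record, padded with blanks.
source_record = {"SSN": "123-45-6789", "NAME": "SMITH     ", "SERVICE": "USN "}
target_record = transform(source_record, STUDENT_MAPPING)
print(target_record)
```

The `transform` routine contains no knowledge of either format; supporting a new pair of nodes means adding a mapping to the dictionary rather than writing a new program.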



The distributed DD/DS [data dictionary] can facilitate
these translation processes by providing the metadata
mappings to allow the source to be transformed into the
target data. This is accomplished by storing in the
data dictionary the source and target metadata
descriptions to be used by the mapping process. [Ref. 35]

It is possible for the distribution of dictionary
capabilities to be accomplished by several alternative
configurations. One possible configuration, as mentioned
earlier, involves duplicating the data dictionary in its
entirety at each node of the network. An example of this is
shown as Figure 6.1. (Dashed lines indicate node-to-node
communications and dotted lines indicate
dictionary-to-dictionary communications.) Each data
dictionary will contain a complete copy of the entire
organization's metadata. While the nodes themselves will
interact frequently, the various copies of the dictionary
will not. However, when one copy of the dictionary is
updated, all other copies must be automatically updated if
data integrity is to be maintained. This duplication of
metadata will result in some degree of additional overhead,
but it will improve the responsiveness of the system and
minimize the necessity of inter-data dictionary queries. In
some implementations, communication costs can be
significantly reduced. This configuration will be most
desirable in cases in which the organization's database(s)
are also duplicated at each node or if nodes would be likely
to access each other's metadata often. A stable
organization with well-established data processing, where
metadata is not continuously being updated, would benefit
most from this configuration.

| Network Node    |     | Network Node    |
| DATA DICTIONARY |     | DATA DICTIONARY |

| DATA DICTIONARY |     | DATA DICTIONARY |
| Network Node    |     | Network Node    |

Figure 6.1 Duplicated Data Dictionaries
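
The trade-off in the duplicated configuration (every update is sent to every copy, but every lookup is local) can be shown in a few lines of Python. The class below is a toy model of our own; the node names are invented, and a real system would also need to handle failed or delayed updates, which are ignored here.

```python
from typing import Dict, Iterable


class ReplicatedDictionary:
    """Toy model: one complete copy of the metadata at every node."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self.copies: Dict[str, Dict[str, dict]] = {node: {} for node in nodes}
        self.messages_sent = 0  # update messages sent to remote copies

    def update(self, origin: str, entity: str, metadata: dict) -> None:
        """Apply an update at `origin` and propagate it to every other copy."""
        for node, copy in self.copies.items():
            copy[entity] = dict(metadata)
            if node != origin:
                self.messages_sent += 1

    def lookup(self, node: str, entity: str) -> dict:
        """Answer from the local copy; no inter-dictionary query is needed."""
        return self.copies[node][entity]


dd = ReplicatedDictionary(["MONTEREY", "WASHINGTON", "NORFOLK"])
dd.update("MONTEREY", "RECORD.STUDENT", {"length": 4, "type": "C"})
print(dd.lookup("NORFOLK", "RECORD.STUDENT"))
print(dd.messages_sent)
```

Each update costs one message per remote node, while lookups cost none, which is why the configuration suits organizations whose metadata changes rarely.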

In the second configuration, the data dictionary is
partitioned among the various network nodes. As shown in
Figure 6.2, each node contains only that portion of the
dictionary that holds the metadata it requires. No one
node or station within the system will have a complete data
dictionary. This configuration would be used when there is
not much need for the nodes of the network to access each
other's metadata and there is a relatively clear-cut
differentiation between the functions being carried on at
each node, which implies different metadata. Because
redundancy is kept to an absolute minimum, problems could
arise if a node's data dictionary partition were lost, unless
good backup procedures were in effect. Since each node is
responsible only for maintaining its own portion of the whole,
there is little update overhead and thus little system delay
as long as the required metadata exists at that particular
node.

Figure 6.2 Partitioned Data Dictionary (DD)
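
A minimal Python sketch of the partitioned configuration follows. The class, the node names "Supply" and "Personnel", and the sample entry are hypothetical and chosen only for illustration. The sketch shows a lookup that is free when the metadata is local, counts a remote query when it is not, and fails once the only partition holding an entry is lost.

```python
class PartitionedDictionary:
    """Each node holds only its own portion of the metadata (hypothetical)."""

    def __init__(self):
        self.partitions = {}     # node name -> {entity: metadata}
        self.remote_queries = 0  # lookups that had to leave the asking node

    def define(self, node, entity, metadata):
        # A node maintains only its own partition, so nothing is propagated.
        self.partitions.setdefault(node, {})[entity] = metadata

    def lookup(self, asking_node, entity):
        local = self.partitions.get(asking_node, {})
        if entity in local:
            return local[entity]
        # Not held locally: some other node's partition must be consulted.
        for node, partition in self.partitions.items():
            if node != asking_node and entity in partition:
                self.remote_queries += 1
                return partition[entity]
        return None

    def lose_partition(self, node):
        # With no redundancy, a lost partition takes its metadata with it.
        self.partitions.pop(node, None)


# Illustrative data only: node and entity names are assumptions.
dd = PartitionedDictionary()
dd.define("Supply", "Ammunition Transaction", {"kind": "report"})
```

Calling `dd.lookup("Supply", "Ammunition Transaction")` is answered locally, whereas the same lookup from "Personnel" increments `remote_queries`.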

In the final configuration, the data dictionary is
distributed in a hierarchical structure. There will be one
"master" copy of the dictionary and one or more partial
copies throughout the network, as shown in Figure 6.3. In
this configuration, each node that contains a portion of the
data dictionary is responsible for updating the master
dictionary whenever its portion is modified. This structure
ensures data integrity and provides flexibility by allowing
varying amounts of metadata to be distributed. Another use
for this hierarchical structure might be to separate
functionality within a network, e.g., database, automated
office, and programming functions. Each of these functions
is able to maintain its portion of the dictionary locally
while one master copy is available to handle inter-partition
queries.

[Figure: a master DATA DICTIONARY at one network node, above three network nodes that each hold a DD partition]

Figure 6.3 Hierarchy of Distributed Data Dictionaries
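
The hierarchical arrangement can be sketched in the same way. Again, this is a sketch under stated assumptions: the classes `MasterDictionary` and `NodePartition`, the partition names, and the sample metadata are invented for illustration. Each partition reports its own modifications to the master, and a lookup that cannot be answered locally is referred to the master.

```python
class MasterDictionary:
    """The single master copy that sees every partition's changes."""

    def __init__(self):
        self.entries = {}  # entity -> (owning node, metadata)

    def record(self, node, entity, metadata):
        self.entries[entity] = (node, dict(metadata))

    def query(self, entity):
        return self.entries.get(entity)


class NodePartition:
    """A partial copy that keeps the master informed of its changes."""

    def __init__(self, name, master):
        self.name = name
        self.master = master
        self.entries = {}

    def modify(self, entity, metadata):
        # Update locally, then update the master as the structure requires.
        self.entries[entity] = dict(metadata)
        self.master.record(self.name, entity, metadata)

    def lookup(self, entity):
        if entity in self.entries:
            return self.entries[entity]
        # Inter-partition queries are answered by the master copy.
        found = self.master.query(entity)
        return found[1] if found else None


# Illustrative data only: partition names follow the functional split
# mentioned in the text; the metadata is an assumption.
master = MasterDictionary()
database = NodePartition("database", master)
office = NodePartition("automated office", master)
database.modify("Readiness Status", {"type": "code"})
```

Here `office.lookup("Readiness Status")` succeeds even though the entry is maintained by the database partition, because the master holds a record of it.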

There are presently several commercial packages in the
development or testing stages that will be able to satisfy
the requirements of distributed processing. One system that
is already available and being used in numerous applications
is ADR's Relational Information Management Environment
(RIME) system. As discussed in Chapter V, this system
features fourteen separate components that can be integrated
into one "total" system. One component, D-NET, combines a
database, data dictionary, and communications interfaces to
support the special requirements of distributed processing.
D-NET is capable of supporting both homogeneous and
heterogeneous networks:

The flexibility provided by D-NET and the other software 
components allows users to configure the distributed 
system networks based on the needs of each node. 
Various operating systems, computer types, and cooper- 
ating software products can be used to create a specific 
environment without impacting application development 
and operations. [Ref. 36] 

D-NET can implement the system's data dictionary,
DATADICTIONARY, as either one centralized dictionary or
multiple copies stored at remote locations. Similarly,
RIME's database, DATACOM/DB, can be maintained either at one
centralized location or distributed to various nodes
throughout the network. D-NET serves as the basis of the
Army's project VIABLE, providing numerous benefits that
include cost effectiveness, high expandability, increased
productivity, resource control and synchronization, and
independent operation at the local user's level.

B. DECISION-MAKING 

In this section we will show how the data dictionary 
provides managers with the efficiently recorded, accurate, 
and timely information necessary to make decisions in conso- 
nance with the goals of the organization, whether in a 
centralized or distributed environment. According to the 
report of the Committee on Review of Navy Long-Range ADP
Planning, "information technology", which includes data
dictionaries, is

critical to the Navy's ability to fulfill its wartime
and peacetime roles in an optimum manner. The available
technologies would enable the Navy to approach its
missions with information and data that (1) have been
collected and recorded simply, (2) have improved
accuracy, (3) have been speedily reported, collated, and
distributed, (4) lead to summaries that are timely and
to the point, as and when needed, and (5) have enabled
both manpower commitments and costs to be reduced.
[Ref. 37]



1. The Decision-Making Process

Herbert Simon's classic model of the decision-making
process, as cited by Sprague and Carlson [Ref. 38], consists
of three distinct steps: intelligence, design, and choice. 
The use of a data dictionary supports the decision maker as 
he takes each step. 

a. Intelligence involves searching the environment
for conditions calling for decisions. Raw data must be
obtained, processed, and examined for clues that may
identify problems. However, so much data is available within
an organization that a seemingly infinite parade of
information can be produced; this situation is called
information overload. There must be some way of narrowing
down the amount of information that is presented to the
decision maker. A data dictionary used in conjunction with a
database can play an important role in this narrowing process.
As discussed earlier in the thesis, the dictionary helps an
organization identify and eliminate redundant data. Its query
language can be used to select information about a particular
entity, and its report definition capability can be used to
generate aggregate, rather than detailed, data. Relationships
between entities are easily identified so that managers'
questions such as "What is the range of values for 'Readiness
Status' data?" and "Which departments receive the 'Ammunition
Transaction' report?" can be answered.

b. Design entails inventing, developing, and
analyzing possible courses of action. This involves
processes to understand the problem, to generate solutions,
and to test solutions for feasibility. The data dictionary
plays a key role in documenting the decision maker’s
environment so that he or she will have a centralized source of
information from which to develop possible choices. The
dictionary can also be used to tailor information to meet
specific needs by defining user views of data and
restricting user access to certain data. In this way, users
can be presented only with the information they are supposed
to have and need to have, as determined by higher authority
in the organization, instead of having to deal with
nonessential information.

In addition to recording information about the 
plans, structure, and functions of the organization, the 
data dictionary can also be used to record information about 
the decision makers themselves. In the case of the U.S.S.
Constellation, for example, information about the commanding
officer and the key elements of his environment can be
documented: which decisions he wishes to make and which ones
his subordinates will make, the mission assigned to the
carrier by the C.O.'s superiors, the relative priorities he
attaches to various subjects, his short term and long term
personal goals, previous decisions he has made, and so on.

c. Choice involves selecting a particular course of
action from those available and implementing that choice. 
Of course, the ultimate decision will lie with the decision 
maker, and not with the data dictionary. At best, the data 
dictionary can present options to the decision maker and, 
once the choice is made, can document the steps taken to 
implement that choice. 
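
The dictionary queries mentioned under the intelligence step and the user views mentioned under the design step can be sketched together in Python. Everything in the sketch is an assumption made for illustration: the class and method names, the readiness codes, the department names, and the user "supply_officer" do not come from any dictionary package discussed in this thesis.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A data element and the values it is permitted to take."""
    name: str
    allowed_values: tuple = ()


@dataclass
class Report:
    """A report and the departments that receive it."""
    name: str
    recipients: set = field(default_factory=set)


class Dictionary:
    def __init__(self):
        self.elements = {}
        self.reports = {}
        self.views = {}  # user -> names of the elements the user may see

    def add_element(self, element):
        self.elements[element.name] = element

    def add_report(self, report):
        self.reports[report.name] = report

    def define_view(self, user, element_names):
        self.views[user] = set(element_names)

    def value_range(self, element_name):
        # "What is the range of values for 'Readiness Status' data?"
        return self.elements[element_name].allowed_values

    def recipients_of(self, report_name):
        # "Which departments receive the 'Ammunition Transaction' report?"
        return sorted(self.reports[report_name].recipients)

    def visible_elements(self, user):
        # A user view exposes only the elements granted to that user.
        allowed = self.views.get(user, set())
        return sorted(name for name in self.elements if name in allowed)


# Illustrative data only: codes, departments, and users are assumptions.
dd = Dictionary()
dd.add_element(Element("Readiness Status", ("C1", "C2", "C3", "C4")))
dd.add_element(Element("Crew Evaluation"))
dd.add_report(Report("Ammunition Transaction", {"Weapons", "Supply"}))
dd.define_view("supply_officer", ["Readiness Status"])
```

In this sketch the supply officer's view contains "Readiness Status" but not "Crew Evaluation", and a user with no defined view sees nothing.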



2. Crisis Management

The accuracy and timeliness of information provided
to the decision maker become critically important when
the decision-making process occurs during a crisis
situation. In wartime, for example, there is usually a great
deal of risk associated with a decision: many decision
makers are involved, information must be consolidated from a
variety of sources and locations, little time is available
to make decisions, and, due to the uniqueness of events,
there is often no pre-defined structure for making the
decision. There are four ways that the data dictionary can
prove especially helpful in crisis decision-making.

a. The dictionary speeds up the information- 
gathering process. As discussed earlier, user views and 
accesses have been pre-defined and can be changed easily as 
needed. Active data dictionaries provide for automatic 
update of any changes that are made, so information is
always current.

b. The dictionary prioritizes information. The 
priorities of the organization and the decision makers are 
taken into account and can be updated as events occur. In 
this way, the attention of decision makers is focused on 
truly important information rather than dispersed over a 
wide range of information. 

c. The dictionary provides a common information 
base. This is important when many decision makers at 
different locations are involved. All participants have the 
latest information and can also take advantage of the 
"corporate memory" provided by the dictionary. 

d. In short, the dictionary provides "intelligent" 
information management. It reduces information overload, 
tailors information to specific decision-makers’ needs, and 
responds well to infrequent, ad hoc requests. It helps to 
establish relationships between events as they occur. 






The typical, or even "ideal", data dictionary will
not be able to fully support the decision-making process
without the help of additional sophisticated software to
take advantage of its capabilities. We believe that as the
acceptance and use of the data dictionary as a tool for
information resource management become widespread, the
demand for an expanded role for the dictionary will
increase. Organizations must become more accomplished in
the top-down planning process of the system development life
cycle in order to receive maximum benefits from data
dictionary technology.

C. CONCLUSIONS

In this thesis, we have discussed the structure,
functions, and objectives of a data dictionary. We have
compared popular commercial products to an "ideal"
dictionary based on criteria we developed and on FIPS DDS
guidelines. We have analyzed the role of a data dictionary
in information resource management, including its support of
a distributed data processing environment and of the
decision-making process. It seems clear that as
organizations become cognizant of the need to manage their
information efficiently, the importance and necessity of data
dictionary implementation will continue to increase.

Designers of data dictionaries are aware of these trends 
and are moving in the following directions: 

First, toward what is known as an integrated data 
dictionary and second, toward a free-standing dictionary 
that serves as a driver of a distributed data processing 
system made up of several types of computers, data base 
management systems, file managers, and text editors. 
[Ref. 39]

In reference to the first projection, several commercial
systems have been developed that feature integration of a
data dictionary with a database. One example of this is
ADR's RIME, which features integration of a database and a
data dictionary with numerous other components to form one
very capable and flexible system. Addressing the second
projection, Rullo [Ref. 40] foresees development of a "super
data dictionary" to support future integrated and
distributed systems:

In this environment, the data dictionary would act as a
driver of the system. The data dictionary/data
directory might also have some integrated facilities
permitting transfer of data among other system software
functions including itself. There is a trend in this
direction, with other systems depending on the data
dictionary/data directory and that system itself
beginning to resemble a model of the enterprise.

We believe the future holds significant improvements and
expansions of data dictionary technology. It is important
that the development of standards for data dictionary
compatibility continue along with the standards that are
currently being developed to support network
communications. It is conceivable that these standards, if
widely accepted, would allow any data dictionary to "talk"
to another and to exchange information. The FIPS DDS
standards developed by the National Bureau of Standards will
most likely become the basis for data dictionaries procured
and used by the federal government.

We also foresee the use of fourth generation languages,
the extremely user-friendly, "close-to-natural-language"
languages that will facilitate user access to the
dictionary's metadata. These languages will replace the formal
command languages and awkward syntax described earlier in
the thesis. Another factor contributing to the increased
utility of data dictionaries will be the use of
sophisticated software and artificial intelligence techniques in
conjunction with the dictionary. As the central source of
data about an organization, the data dictionar
broad base of information upon which an artific
gence "expert" system can be built. For exam
possible that an expert system would be able t
validate additions to the dictionary schema ba
determined rules and information gained from pre
associations between the contents of the data di
flag them for the attention of the decision make
tion, a "smart" data dictionary would be able
that every time a user logs on to the system,
particular information, so that eventually,
dictionary will provide it for him automatically
No matter what changes occur in data dict
nology, the data dictionary's role in the effic
ment of an organization's information resource w
to be an increasingly important one. The dic
support the organization in its planning and
functions, its development of information system
tenance of those systems, and the intelligent
systems. We believe that the military will so
vast market for data dictionary software an
demands of its users will drive data dictionar
even further.

APPENDIX A
BACKUS-NAUR FORM

Backus-Naur form is a graphic notati
the syntax of a language. It is use
Information Processing Standard for Data
(FIPS DDS) to show the format of the comma
ulate the dictionary. The following are
symbols used by the FIPS DDS:

[WHERE [ATTRIBUTE | A} [FOR] <attribut
[,..., [ attribute-clause-n ]] ] 

WITH SECURITY <security-clause> ]} 

It indicates that there are several differ
an entity to the dictionary. At a minimum
include ENTITY-TYPE or E-T, an entity-type
and a name clause. The words OF and IS ar
the last two phrases set off by brackets,
are used, the same rules hold for choosi
them.